ABSTRACT

The results of a comprehensive research program to develop
efficient transform image coding algorithms are reported in this
dissertation. The objective is to develop algorithms that outperform
the conventional block-encoding procedures, i.e., achieve data rates
below one bit per picture element, which is the approximate lower
limit for conventional transform coders.

The dissertation includes a detailed analysis of image modeling
aspects of the transform coding problem. Two alternative prediction
algorithms are analyzed for the transform sample variance estimation:
the first technique uses a two-dimensional polynomial to model
the image power spectral density; the second technique is a simple
recursive approach based on previously quantized values. The
actual coding algorithms utilize the latter approach.

The generalized phase concept is developed and plays a vital
role in the coding algorithms. Both the Fourier and Walsh transforms
are utilized, the former being demonstrated to have superior
performance. A non-negative image constraint is explored via the
Lukosz bound.

The experimental phase of the study includes two-dimensional
coding of monochrome images, three-dimensional coding of color
images, and coding of interframe images, at 0.38, 0.55, and 0.25 bits
per pixel, respectively. It is ascertained that decoded and
reconstructed images are not significantly degraded. It is also
demonstrated that adaptive transform domain modeling is important, and
that large-size transforms, in conjunction with the proper image model,
can significantly outperform block-encoding techniques.

A requirement for large-size transforms can easily discourage
hardwired usage. Techniques can be developed, however, that could
advantageously be employed for computer-to-computer image
transfer.

Although the new coding-decoding methods are sensitive to
channel errors, it is demonstrated that they produce data which are
statistically equivalent to a discrete memoryless source. Thus,
conventional channel coding techniques can be used.

1. INTRODUCTION


The human visual system can absorb and evaluate vast amounts
of pictorial information. The range of the visual data includes many
different classes such as graphics, biomedical images, or aerial
photographs. The human eye responds to color as well as intensity;
consequently, the general description of an image also contains
spectral information. If the time history of the image is to be
characterized, the dimensionality of the description is further increased.

Mathematically, an image can be represented by a function of
four variables, $I = I(x, y, t, \lambda)$. The spatial coordinates are
$x$, $y$; the variable $t$ represents time, and $\lambda$ is the
wavelength representing a particular spectral component. $I$ represents
the energy to which the eye, as a photoelectric detector, responds. The
energy is a non-negative quantity; consequently, the following
constraint must be satisfied for an image:

$$I(x, y, t, \lambda) \ge 0$$

This simple non-negative constraint introduces various additional
constraints for image sampling and filtering.

This dissertation is devoted to an adaptive technique of image
coding. In terms of the definition of an image, image coding is
specified as a process by which the analog image function $I$ is
represented as a sequence of binary digits. Clearly, the binary
representation must be unique and invertible for a given coder. The
relative efficiency of image coders can be directly compared in terms
of the length of the binary digit sequence generated to characterize a
given image.

1.1 Review of Coding Objectives, Techniques, Results

Although the primary objective of image coding has been
communication bandwidth reduction for pictorial data, there are
additional, equally important considerations. The general availability of
increasingly powerful digital computers has permitted numerical
implementations of many image operations. The number of degrees of
freedom in a typical image is quite large; consequently, the storage
and access of pictorial data represent a significant problem in
themselves.

The definition of image coding given at the beginning of this
chapter is essentially a source coding process. A schematic of the
simplified communication system is given in Figure 1.1-1.

It is the source encoding/decoding which is relevant to the
nature of pictorial information. Specifically, an efficient source
coding process will utilize the statistics and dimensionality (space,
time, and color, as previously indicated) of the pictorial data. The
conversion of the analog image into a binary stream involves various
distinct steps which may include an analog one- or two-dimensional
prefilter, sampler, quantizer, digital preprocessor, and statistical
encoder. All of these operations are largely determined by the
nature of the source.

The channel encoding/decoding, unlike source encoding/decoding,
should be insensitive to the original character of the data.
Although consideration of the channel and related parameters is
important in the overall communication problem, the relevant
encoding/decoding process is not unique to the nature of pictorial
data. It can be expected that the source encoding process will produce
a sequence of binary digits which is statistically equivalent to a
sequence produced by a memoryless discrete source. Classical channel
encoding techniques should, therefore, be applicable to the source
coded image data without specific reference to the pictorial nature of
the information.

Figure 1.1-1. Classical Communication Model

Although the sampling and quantizing process should be
considered as an integral part of picture coding, this is rarely done.
Conventionally, the input to the coder is a sampled and quantized image
which the coding algorithm will process such that its output consists
of a reduced number of binary digits. The conventional sampling is
performed over a rectangular grid and the analog samples are
quantized to 64-256 quantum levels. A picture coding algorithm reduces
the source rate, or equivalently the transmission bandwidth
requirement, by reducing the number of samples and/or reducing the
number of quantum levels.

The well-known and accepted technique of differential pulse
code modulation (DPCM) reduces the number of quantum levels without
sample reduction (Cutler, 1952; Graham, 1958; O'Neal, 1966).
DPCM achieves the rate reduction by encoding sample differences
rather than the samples themselves. Many variants of this coding
technique exist. Compared with the conventional 8-bit PCM code, a
well-designed DPCM system can achieve a factor of three rate
reduction, i.e., 2.5 to 3 bits per original picture element.
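
The idea of encoding sample differences can be illustrated with a minimal previous-sample DPCM sketch in Python. The predictor (the previously reconstructed sample), the uniform quantizer with a step of 8, and the names `dpcm_encode` and `dpcm_decode` are illustrative assumptions, not the designs of the cited authors.

```python
def dpcm_encode(samples, step=8):
    """Return quantizer indices of the differences between each sample
    and the decoder's reconstruction of the previous sample."""
    codes = []
    prediction = 0  # the encoder tracks what the decoder will reconstruct
    for s in samples:
        q = int(round((s - prediction) / step))  # index sent over the channel
        codes.append(q)
        prediction += q * step  # same update the decoder performs
    return codes


def dpcm_decode(codes, step=8):
    """Rebuild the samples by accumulating the dequantized differences."""
    out = []
    prediction = 0
    for q in codes:
        prediction += q * step
        out.append(prediction)
    return out


row = [100, 104, 109, 120, 118, 117, 90, 88]  # hypothetical scan line
codes = dpcm_encode(row)
reconstructed = dpcm_decode(codes)
```

Because the encoder predicts from the reconstructed value rather than from the original sample, the quantization error does not accumulate: each reconstructed sample stays within half a quantizer step of the original. After the first sample, the indices are small whenever neighboring samples are similar, which is what allows fewer quantum levels per sample.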

The various algorithms that decompose the image or its
derivatives into contours can achieve significant rate reduction for
specific types of images, namely, those that can be described by a
small number of contours. The disadvantage of this technique is the high
degree of computational complexity and the large buffer requirement
(Graham, 1967): the entire image must be simultaneously available to
the processing algorithm. Contour tracing algorithms have been adapted
to frame-to-frame image coding (Habibi, 1973). In this case, the
frame-to-frame image difference is subjected to the coding algorithm.
The receiver, upon decoding the difference image, updates the previous
frame. Frame-to-frame coders of this type can achieve a rate of one
bit per picture element.

Coders that adapt to the local statistics of the image can achieve
additional rate reduction over nonadaptive algorithms. The dual coder
is an example of this technique (Frei, Schindler, and Veitinger, 1972).
In this case, the sampling rate is changed according to the amount of
local picture detail.

As stated earlier, the general image representation requires
four dimensions: two for space, one for time, and one for color.
Most coding techniques consider only monochrome images. Only
recently has color coding acquired more attention (Bhushan, 1970;
Pratt, 1971). Use of frame-to-frame redundancy in images is another
research topic which has not been extensively explored.

1.2 Transform Techniques for Image Coding

Most classical spatial domain coding techniques (contour coding
is the exception) generate code words based on the original picture
elements (PEL) through a one-to-one mapping. In other words, the
bandwidth reduction is achieved by requantization. Although the
mapping is one-to-one, inter-element correlations are often utilized
by the coding algorithm (Habibi, 1971). What is fundamentally
different for transform coding is that part or all of the image is
transformed into another domain via an invertible mapping. The sample
reduction and requantization are performed on the transformed
values, and the resultant code words are then transmitted through the
channel. Upon receiving the code words, the receiver attempts to
reconstruct the original image by applying the inverse of the
transform.

Numerous techniques have been developed for transform coding
over the last five years (Wintz, 1972). Although a practical ranking
cannot be made, many of these techniques result in data rates as low
as 1 bit/pel. The theoretical justification and motivation behind
transform coding have been rather varied. Transform coding has been
analyzed essentially by statistical tools. One basic motivation has
been sample reduction. The "useful" transforms have the property
that most of the image energy is concentrated in relatively few
transform samples. Stated differently, many transform samples have
very small amplitudes and can therefore be discarded without being
transmitted through the channel.

The Karhunen-Loeve (K-L) transform is the optimum transform
for images describable by second order statistics (Thomas, 1968). It
has been shown that for correlated Gaussian sources the optimum
quantizer will decorrelate the samples via the K-L decomposition, and
the bit rate is determined in proportion to the variances of the
transformed samples. The K-L transform, by definition, diagonalizes the
image covariance function. The diagonal terms are the eigenvalues and
are ordered in decreasing magnitude.
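
The diagonalizing property can be checked on the smallest possible case. The sketch below is an illustration only: it assumes a hypothetical pair of unit-variance samples with correlation `rho`, for which the K-L basis is known in closed form (the normalized sum and difference of the two samples), and verifies that the transformed covariance is diagonal with eigenvalues 1 + rho and 1 - rho.

```python
import math


def kl_transform_2x2(rho):
    """K-L basis and transformed covariance for the covariance matrix
    [[1, rho], [rho, 1]] of two unit-variance, correlated samples."""
    cov = [[1.0, rho], [rho, 1.0]]
    r = 1.0 / math.sqrt(2.0)
    basis = [[r, r], [r, -r]]  # rows are the eigenvectors of cov
    # tmp = basis * cov
    tmp = [[sum(basis[i][k] * cov[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    # diag = basis * cov * basis^T
    diag = [[sum(tmp[i][k] * basis[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]
    return basis, diag


basis, diag = kl_transform_2x2(0.9)
print(diag)  # off-diagonal terms vanish; diagonal holds the eigenvalues
```

For a strongly correlated pair, almost all of the variance falls on the first transformed sample, which is the energy compaction that motivates transform coding.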

The K-L transform is almost synonymous with optimum image
coding, and sometimes the relevant assumptions are neglected. In
the practical sense, the K-L transform has somewhat less universal
utility. Even theoretically, the K-L transform is optimum only in the
mean-square error sense and only through second order statistics.
For a correlated Gaussian source, the optimality is in fact achieved.
Practical image sources are not Gaussian and have moments of higher
than second order which cannot be derived from the first two.

The lack of availability of the covariance function is another
difficulty. There are two fundamental questions to be analyzed:
(1) How meaningful is the concept of covariance to images? Stated in
another way: is image covariance a valid statistical concept for
images which are likely to be nonstationary? (2) If we ignore the
first question, how will the functional form of the covariance function
be determined?

Question number one is, in fact, ignored in practice; and the
perhaps oversimplified statement can be offered that because of the
lack of better statistical understanding of images, no better
parameter has yet been offered.

The approximation of the covariance function or its transform
domain equivalent can be done either numerically or by a closed
form function. For the first case, one can directly determine the
transform sample variances experimentally and make the bit
assignment accordingly. An example of the functional form is:

$$\exp(-\alpha |x| - \beta |y|)$$

This simple exponential form has been used successfully in spite
of its gross simplicity (Habibi and Wintz, 1972). The parameters
$\alpha$ and $\beta$ represent the horizontal and vertical correlation,
and directional separability of these principal axes is assumed. The
exponential form of the covariance function is attractive: it is
simple, and the parameters $\alpha$ and $\beta$ are easily estimated.
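
One common way to estimate such a parameter, sketched here under assumptions not stated in the text, is to measure the normalized correlation between horizontally adjacent samples: if the covariance falls off as exp(-alpha |x|) and the lag is one sample, that correlation equals exp(-alpha). The function names and the test row are hypothetical.

```python
import math


def lag1_correlation(row):
    """Normalized correlation between adjacent samples of one scan line."""
    n = len(row)
    mean = sum(row) / n
    var = sum((v - mean) ** 2 for v in row) / n
    cov = sum((row[i] - mean) * (row[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var


def alpha_from_correlation(rho):
    """Invert rho = exp(-alpha) for a lag of one sample."""
    if not 0.0 < rho <= 1.0:
        raise ValueError("exponential model needs a correlation in (0, 1]")
    return -math.log(rho)


row = [10, 12, 13, 15, 18, 20, 21, 23]  # hypothetical smooth scan line
rho = lag1_correlation(row)
alpha = alpha_from_correlation(rho)
```

The same procedure applied along columns would give the vertical parameter. The estimate is only as meaningful as the stationarity assumption behind it.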

A small number of statistical parameters is desirable in any
coding scheme. Since both the receiver and transmitter must know
these parameters, their transmission may require non-negligible
bandwidth and should be considered as part of the overall bit rate.

The  separable  form  of  the  covariance  function,  although  not 
necessarily  characteristic  of  actual  image  fields  themselves,  has 
served  a  useful  purpose. 

Let

$$R(x, y) = R_x(x)\, R_y(y)$$

be the covariance function. Let $T$ represent the transformation
operator; then, symbolically, if $T = T_x T_y$,

$$T\{R(x, y)\} = T_x\{R_x(x)\}\, T_y\{R_y(y)\} = S_x(u)\, S_y(v)$$

where

$$T_x\{R_x(x)\} = S_x(u)$$

$$T_y\{R_y(y)\} = S_y(v)$$
The generalized power spectral densities $S_x$, $S_y$ should decrease
for increasing values of the transform domain coordinates $u$, $v$ if
the transform operations are to be useful for image coding. This is
achieved by the proper choice of the transform operator $T$. The bit
assignment is proportional to $\log S_x(u) + \log S_y(v)$. The clear
implication is that the principal axes in the transform plane (e.g.,
when either $u$ or $v$ is zero) will receive a relatively large
fraction of the available bits. Most transform image coding techniques
operate on adjacent sub-blocks rather than the whole image itself. The
separable covariance function results in effectively superior recovery
of the horizontal and vertical image structure. The block boundaries,
however, become an integral part of the image statistics, and their
objectionable visual appearance is greatly diminished by the
utilization of the separable covariance model. On the other hand, at
very low bit rates, an excessive amount of the bandwidth may be
required to maintain the horizontal and vertical structure at the
expense of a greater amount of resolution loss than may be justified.
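
The concentration of bits along the principal axes follows directly from the logarithmic rule. The sketch below assumes hypothetical, monotonically decreasing spectral values, base-2 logarithms, a proportionality constant of 0.5, rounding to whole bits, and a floor of zero bits; none of these choices is taken from the text.

```python
import math


def bit_map(sx, sy, scale=0.5):
    """Bits for transform sample (u, v): scale * (log2 Sx(u) + log2 Sy(v)),
    rounded to an integer and floored at zero."""
    return [[max(0, round(scale * (math.log2(a) + math.log2(b)))) for b in sy]
            for a in sx]


sx = [64.0, 16.0, 4.0, 1.0]  # hypothetical horizontal spectral density
sy = [64.0, 16.0, 4.0, 1.0]  # hypothetical vertical spectral density
bits = bit_map(sx, sy)
for row in bits:
    print(row)
```

With these values the row u = 0 receives 6, 5, 4, 3 bits while the row u = 3 receives 3, 2, 1, 0, so the first row and first column of the transform plane absorb a large share of the budget.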

The choice of the actual transforms has been dictated by the
requirement of computational ease and the potential of practical
implementation. The transformations considered to date are the
Fourier, Hadamard, K-L (on sub-blocks), and, more recently, the
Slant transform (Habibi and Wintz, 1971; Anderson and Huang,
1971; Pratt, 1972; Pratt, Welch and Chen, 1972). All of these
except K-L can be implemented by "fast" algorithms.

1.3  Research  Objectives 

The amount of visual data generated in commercial and
scientific applications is enormous. The ordinary home television
set generates over 500 x 500 samples 30 times a second. The Earth
Resources Technology Satellites and weather satellites typically
produce in excess of 4000 x 4000 and 8000 x 8000 data points,
respectively. Data storage and transmission become a major problem
for pictorial sources because of the excessive amount of data.
Clearly, techniques that permit greater efficiency (e.g., reduction
in the required bandwidth) are urgently needed.
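
The scale of the problem can be made concrete with a short calculation. The sketch below assumes the conventional 8-bit PCM representation mentioned earlier and treats the sample counts quoted above as exact; the function name is hypothetical.

```python
def raw_bits(width, height, bits_per_sample=8, frames=1):
    """Number of bits needed to represent `frames` images of the given size."""
    return width * height * bits_per_sample * frames


tv_bits_per_second = raw_bits(500, 500, frames=30)            # 8-bit PCM television
satellite_frame_bits = raw_bits(4000, 4000)                   # one 4000 x 4000 frame
coded_tv = raw_bits(500, 500, bits_per_sample=1, frames=30)   # at 1 bit per picture element

print(tv_bits_per_second, satellite_frame_bits, coded_tv)
```

Under these assumptions, uncoded television requires 60 million bits per second and a single 4000 x 4000 frame requires 128 million bits, while a coder operating at one bit per picture element would need 7.5 million bits per second.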

Numerous approaches have been considered for efficient picture
coding. While these techniques are based on widely different
considerations, they are all motivated by the required simplicity of
potential implementation. Consequently, the developed algorithms
are relatively simple, utilize simple models, and are somewhat
inflexible in terms of their adaptivity to the image structure.

The philosophy on which this dissertation is based emphasizes
flexibility and maximum efficiency at the possible expense of
increased computation and buffer requirements. Demonstration of
a highly efficient coding scheme, even if impractical for actual
implementation, provides a new lower bound for other bandwidth
reduction schemes. Secondly, even if the implementation of a
"hard-wired" configuration of the particular algorithm is not
warranted, it may be valuable in the computer-to-computer
communication environment.

Development of computer networks whose individual computer
members may be separated by vast geographical distances is a
modern concept which allows higher utilization of the modern "super"
computers. The network of the Defense Advanced Research Projects
Agency (DARPA) of the U.S. Department of Defense is an operational
example; other networks are likely to follow. By design, a large
scale computer network can perform arithmetic operations
inexpensively. The data transmission, however, remains a relatively
important cost factor. Image manipulation within the network will
probably be expensive because of the requirement for large-volume
data transmission. On the other hand, arithmetically complex
coding/decoding algorithms may be easily programmed for the local
"host" computers. The extra amount of computation may be offset to a
significant degree by cost reduction for the transmission of the
visual data.

Most of the picture coding techniques have considered monochrome
imagery only. Multispectral, color, and frame-to-frame
coding requirements have been addressed only very recently, and
there is still a great deal of research needed in these areas.

Another objective of this dissertation is to extend the concepts
developed for monochrome imagery to the "third dimension";
specifically, color imagery and frame-to-frame redundancy are considered.

1.4 Overview of the Dissertation

A research project on adaptive transform domain coding is
described in this dissertation. The presentation of the objectives,
development, and experimental results follows what the author
believes to be a carefully developed logical order, which is
summarized in this section.

Chapter 1 is the Introduction and as such lays the groundwork
for the basic body of the dissertation. This chapter also places the
research project into perspective relative to the large amount of
research previously conducted in the field of picture coding. The
primary objectives of the dissertation are also spelled out in this
chapter.


Images are a specific class of signals and require careful
consideration if extreme redundancy reduction is desired. Chapter 2
addresses this important point of how images can be modeled and
characterized in terms of statistical and deterministic parameters.
Generation of the sampled image is considered, and important
comparisons with the one-dimensional classical sampling theorem are
given. The image model as a statistical representation is also given.
A review of the Fourier transform constraints is provided. Error
criteria and image structure are briefly considered. The extremely
important non-negative bound due to Lukosz is analyzed as related to
sampling and the relative importance of amplitude vs. phase.

Chapter 3 presents the theoretical basis for the adaptive
transform domain coding technique. It begins with the comparison
of source and channel coding and consideration of the schematic
representation of adaptive techniques. Statistical properties of the
Fourier and Walsh domains are analyzed. Phase and amplitude coding are
considered in terms of quantization, sampling, and relative amount
of information. Nonlinear effects of phase quantization are
considered. The relative importance of phase is demonstrated via
nonlinear filtering and gross reduction of amplitude information.

Chapter 4 is the first of three chapters discussing the
experimental results. Monochrome image coding is considered in this
chapter. Detailed discussion is given of the following topics: the
algorithm, preprocessing, and error analysis. Comparison is made
with the conventional Markov model. Sensitivity analysis of noise
effects on the coding algorithm is performed. Pictorial examples are
included.

Experimental results of color coding are presented in Chapter
5. This chapter briefly reviews the theory of color perception and
representation of color images. Extension of the monochrome
algorithm of Chapter 3 is discussed and is followed by pictorial
examples.


Frame-to-frame coding is considered in Chapter 6. Algorithm
development is discussed and implementation includes both the
Fourier and Walsh transforms. Pictorial results are provided.
Unfortunately, the actual visual performance of the frame-to-frame
coder can only be demonstrated in a realistic time-variant medium
such as video presentation.

Chapter  7  summarizes  the  dissertation. 

Appendix  A  contains  the  original  test  images.  The  numerical 
noise  generation  process  of  the  large  Fourier  transform  is  considered 
in  Appendix  B. 


2.  IMAGE  MODELING 


The  fundamental  objective  of  the  research  project  presented  in 
this  dissertation  is  the  development  of  a  very  efficient  source  coding 
method  for  images.  The  emphasis  is  on  the  efficiency  even  at  the 
expense  of  more  complex  algorithms  and  data  handling.  Clearly,  the 
coder/decoder  process  must  utilize  as  much  a  priori  information  as 
possible.  The  model  should  utilize  both  statistical  and  deterministic 
information. 

This  chapter  addresses  the  role  of  image  modeling  in  the  image 
coding  process. 

2.1 Generation of the Discrete Image

Virtually all operations and transforms discussed in this
dissertation are performed numerically on discrete samples. It is
tempting to follow the general approach to image coding and restrict the
analysis to the discrete equivalent of the image. However, it should
be remembered that images are generally viewed in analog form.
The discretization of the image plays a fundamental part in the image
coding process. In addition to the higher dimensionality of the
problem, there are very important factors that distinguish image
sampling from sampling of one-dimensional time dependent signals. These
concepts will now be considered.

Let the image be represented by $I = I(x, y)$, where $x$, $y$ are
spatial coordinates and $I$ represents the analog image. The image is
sampled on a square grid of lattice distance $\Delta$. Let the sampled
image be denoted $I_s$.

The actual image sampling is always, almost by definition,
performed by an optical system. The image (normally a photographic
transparency, print, or an actual scene) sample at location $x, y$ is
imaged onto a photodetector whose output, ideally, is linearly
proportional to the image brightness at that location.

The sample area can be described by an aperture function
$A(x, y)$. Typically $A(x, y)$ has the value of 1 in a small region
around $x, y$ and 0 elsewhere. Allowing for the finite aperture size,
the sampled image has the following definition:

$$I_s(x, y) = \left[\mathrm{comb}\!\left(\frac{x}{\Delta}\right) \mathrm{comb}\!\left(\frac{y}{\Delta}\right) I(x, y)\right] * A(x, y) = \iint \mathrm{comb}\!\left(\frac{p}{\Delta}\right) \mathrm{comb}\!\left(\frac{s}{\Delta}\right) I(p, s)\, A(x - p, y - s)\, dp\, ds \tag{2.1-1}$$

where

$$\mathrm{comb}(x) = \sum_{n=-\infty}^{\infty} \delta(x - n)$$

and $\delta(x)$ is the Dirac delta function.

Considering the Fourier integral of this equation, one obtains
(Goodman, 1968)

$$\tilde{I}_s(u, v) = \left[\mathrm{comb}(\Delta u)\, \mathrm{comb}(\Delta v) * \tilde{I}(u, v)\right] \tilde{A}(u, v) \tag{2.1-2}$$

The frequency domain coordinates are $u$, $v$ and the symbol $\sim$
indicates the Fourier transform.

Equation (2.1-2) is the classical result of sampling theory;
however, one must be careful in its interpretation for image-related
applications. The image $I$ is always bandlimited. Visual scenes
have structure at all levels, at the extreme down to the micro or
molecular structure. Permanent recordings do limit the spatial
frequency extent and therefore become bandlimited. However, they
introduce their own characteristic structure, for example, film
grain. The bandlimiting is also performed by the optical system
that performs the imaging.

The bracketed term in Equation (2.1-2) indicates that the
fundamental frequency band $\tilde{I}(u, v)$ is replicated at locations
$n(1/\Delta)$, $m(1/\Delta)$; $n, m = 0, \pm 1, \pm 2, \ldots$ in the
frequency plane. If twice the bandlimit of $I$ is larger than the
sampling rate, $1/\Delta$, the replicated bands will partially overlap
and undesirable aliasing occurs. The aperture function $A$ should
separate the fundamental band from its replicas. The requirement on
$A$ in this case is that

$$\tilde{A}(u, v) = \begin{cases} 1, & u, v \in \left[-\dfrac{1}{2\Delta}, \dfrac{1}{2\Delta}\right] \\ 0, & \text{otherwise} \end{cases} \tag{2.1-3}$$

or, equivalently,

$$\tilde{A}(u, v) = \mathrm{rect}(\Delta u)\, \mathrm{rect}(\Delta v) \tag{2.1-4}$$

Equation (2.1-4) contradicts the physical constraint that the
aperture function must be non-negative. Equation (2.1-4) leads to
the unrealizable condition that

$$A(x, y) = \Delta^{-2}\, \mathrm{sinc}\!\left(\frac{x}{\Delta}\right) \mathrm{sinc}\!\left(\frac{y}{\Delta}\right) \tag{2.1-5}$$

where

$$\mathrm{sinc}\, x = \frac{\sin \pi x}{\pi x}$$

A similar argument indicates (a more formal argument will be
presented under the Lukosz bound section) that an optical system
cannot perform the bandlimiting without attenuation in the passband.
On the other hand, the minimization of the attenuating effect of the
optical system and/or the sampling aperture may lead to aliasing.
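
The overlap condition stated earlier in this section can be written as a small one-dimensional check. This is a sketch under stated assumptions: the function names are hypothetical, `delta` stands for the sampling lattice distance, and `apparent_frequency` uses the standard folding rule for a sampled sinusoid, which is not derived in the text.

```python
def replicas_overlap(bandlimit, delta):
    """True when twice the bandlimit exceeds the sampling rate 1/delta,
    i.e. when the replicated spectral bands overlap."""
    return 2.0 * bandlimit > 1.0 / delta


def apparent_frequency(f, delta):
    """Frequency within half the sampling rate of zero that a sampled
    sinusoid of frequency f cannot be distinguished from."""
    rate = 1.0 / delta
    return f - rate * round(f / rate)


print(replicas_overlap(0.4, 1.0))     # band fits between the replicas
print(replicas_overlap(0.6, 1.0))     # replicas overlap, so aliasing occurs
print(apparent_frequency(0.75, 1.0))  # a 0.75 cycle/sample component folds back
```

The same test applies independently along each axis of the square sampling grid.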

2.2  Statistical  Consideration 

The image sampling process and the non-negative image
constraint are deterministic bounds. There are other descriptive
constraints on images which can only be utilized through statistical
consideration.

A wealth of knowledge has been developed in statistical
communication theory and related disciplines which can be very useful
in the design of image coding algorithms. The image can be considered
as a sample function generated by a stochastic source. The statistics
of the source may be available or can be estimated or, as is usually
done, calculated from the image itself. For the latter case to be
valid, ergodicity for the calculated parameters usually must be
assumed.

Knowledge of the image second order statistics can provide
significant assistance in the development of efficient image coding
algorithms. The image correlation function, $R(x_1, x_2; y_1, y_2)$, is
defined as

$$R(x_1, x_2; y_1, y_2) = \overline{\left\{I(x_1, y_1) - \overline{I(x_1, y_1)}\right\}\left\{I(x_2, y_2) - \overline{I(x_2, y_2)}\right\}} \tag{2.2-1}$$

The over-bar indicates ensemble averaging. $I$ is the image,
which in this case is considered as a random process, and
$I(x_i, y_i)$ is a sample of that process and is considered as a
random variable.

The correlation function is usually estimated by invoking the
ergodicity argument and the assumption of wide sense stationarity.
If $R$ can be decomposed into the product of vertical and horizontal
correlation functions, then

$$R(x, y) = R_x(x)\, R_y(y) \tag{2.2-2}$$

The approximation of $R_x$ and $R_y$ by exponential functions has
been utilized for coding (Habibi and Wintz, 1971) as well as filtering
(Pratt, 1972) of images, and for this case the correlation function is
given by

$$R(x, y) = e^{-\alpha |x|}\, e^{-\beta |y|} \tag{2.2-3}$$

Although the actual image coding techniques discussed in this
dissertation do not utilize Equation (2.2-3), the discussion of this
structure is desirable since many transform image coding algorithms
rely on the separable exponential form. In subsequent
chapters comparison will be made between such approaches to image
coding and the new algorithms presented in this dissertation. Some
of the later analysis, for example that of the effect of additive
noise on the coder, will require an explicit form for the correlation
function; the exponential form will be used because of its simplicity.

One  should  emphasize  that  Equation  (2.2-1)  refers  to  the 
recorded  sampled  image.  The  correlation  properties  of  the  analog 
visual  scene  are  rarely  available  and  can  only  be  inferred  from 
detailed  knowledge  of  the  sampling  parameters. 

The Fourier transform of Equation (2.2-1) is the conventional
definition of power spectral density, $S(u, v)$. Using the aperture
function $A$ of subsection 2.1, it is straightforward to show that

$$S_s(u, v) = |\tilde{A}(u, v)|^2\, S(u, v) \tag{2.2-4}$$

Here, the subscript $s$ denotes the sampled version. As indicated
previously, the lack of precise knowledge of the sampling parameters
does not permit accurate modeling of the original image. The
structural form of Equation (2.2-4) permits a somewhat different
interpretation. The sampled image can be considered as one which
has been processed by the linear spatial filter $\tilde{A}(u, v)$.
Consequently, the conventional PCM code available to the image coder
reflects the essentially low-pass filtering effect of the sampling
process.

2.3  Consideration  of  the  Transform  Domain 

Image coding algorithms generally operate on image elements
directly. The significant advances in digital hardware technology
stimulated research in a new approach to image coding which has
come to be known as transform coding. In this section, a short
overview of the transform domain is given.

Let the image be denoted by I as in Chapter 1, $I = I(x, y, \lambda, t)$,
indicating the functional dependence on the spatial coordinates (x, y),
color ($\lambda$) and time (t).

A transform coder algorithm operates in a domain other than
the original described by the four parameters: x, y, $\lambda$, t. The
following symbolic representation can be written

$$\mathcal{I}(u_1,u_2,u_3,u_4) = T\{I(x,y,\lambda,t)\} \tag{2.3-1}$$

T  is  the  operator  which  performs  the  transformation  between  the  two 
domains  and  it  should  be  invertible.  The  latter  requirement  is  due  to 
the  fact  that  without  coding  no  ambiguity  should  be  present  in  the 
image  transformation.  Consequently, 

$$I(x,y,\lambda,t) = T^{-1}\{\mathcal{I}(u_1,u_2,u_3,u_4)\} \tag{2.3-2}$$

and

$$T\,T^{-1} = T^{-1}\,T = 1 \tag{2.3-3}$$

where 1 is the identity operator.

Other than the requirement for invertibility, T is completely
general. Specifically, it may be linear or nonlinear. The operator
T may be decomposable and it can operate on the continuous analog
image or its sampled equivalent.

The choice of T is motivated by the hope that
$\mathcal{I}(u_1,u_2,u_3,u_4)$ can be coded more efficiently.

Practical requirements restrict T to mathematical forms
which are numerically implementable without excessive computation.
The transform algorithms which have been successfully implemented
can be grouped into three classes.

a) Karhunen-Loeve (K-L) Transform

The image I is expanded into the eigenfunctions of the
image covariance matrix. Although this transform is important
from the theoretical viewpoint, its practical value is much less
significant. The difficulties are lack of a "fast" implementation, and,
in addition, the exact form of the covariance function usually is not
available. In the presence of noise, the eigenfunction expansion
will become degenerate. This is very significant and has not been
considered in the context of image coding. The K-L transform
emphasizes the second order image statistics. Its optimality is
achieved for Gaussian processes which do not closely represent
images in general. The K-L transform assumes stationarity which
is an additional assumption that is rarely met for typical images.
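
The K-L expansion can be sketched numerically for a small
one-dimensional block. The example below assumes an exponential
covariance with an illustrative adjacent-sample correlation of 0.95;
the function name and the numbers are assumptions made for
demonstration only. The eigen-decomposition itself costs on the order
of $N^3$ operations, which reflects the lack of a "fast" algorithm.

```python
import numpy as np

def klt_basis(covariance):
    """Eigenvalues (descending) and eigenvectors of a covariance matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

n = 8
rho = 0.95  # assumed adjacent-sample correlation, illustrative only
lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
covariance = rho ** lags

variances, basis = klt_basis(covariance)
# In the K-L domain the covariance is diagonal: the coefficients are
# uncorrelated and their variances are the eigenvalues.
transformed = basis.T @ covariance @ basis
```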

b)  Trigonometric  Decomposition  (Fourier  Transform) 

The image energy tends to concentrate at low frequencies,
e.g., low values of $u_1, u_2, u_3, u_4$. These deterministic and
statistical properties are useful to the transform coding algorithm
and will be further considered in Chapters 3 and 4. The
multidimensional Fourier transform is decomposable into a set of
one-dimensional transforms and it can be implemented by the "fast"
Fourier transform algorithm. The Fourier domain is also constrained by
the Lukosz bound (subsection 2.5).

c)  Other  Orthogonal  Decompositions 

Transform coding has also been successful in utilizing
various fast orthogonal decompositions. The most well known among
them is the Walsh transform. Although no simple mathematical
justification can be offered for their successful utilization, it can be
shown that these functions are "approximately" trigonometric
functions.

The particular value of the transforms under this
category is their close similarity to the Fourier transform; however,
they are suboptimal relative to it. What is meant by optimality in this
case is deferred to the experimental chapters. In spite of this
suboptimality, the non-trigonometric, orthogonal function decomposition
may be preferred because of the ease of numerical implementation.
The Walsh decomposition can be accomplished without multiplication
or division, and, consequently, its digital implementation is superior
to that of the Fourier transform (Harmuth, 1972), although this fact
is more significant for smaller computers without hardware floating
point multiply and divide registers.

Equations (2.3-1) and (2.3-2) are implemented in numerical
form; therefore, the discrete representation will be considered.
If T is restricted to be a linear operator, these equations can be
represented in (generalized) matrix notation.

$$\mathcal{I}(u_1,u_2,u_3,u_4) = \sum_x \sum_y \sum_\lambda \sum_t A(u_1,u_2,u_3,u_4,x,y,\lambda,t)\, I(x,y,\lambda,t) \tag{2.3-4}$$

In all practical cases, the multidimensional operator can
be factored into a number of operators equal to the dimension of the
problem. Let $A = A_1 A_2 A_3 A_4$, and equivalently
$A = A_1(u_1,x)\,A_2(u_2,y)\,A_3(u_3,\lambda)\,A_4(u_4,t)$.
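
The factoring of the multidimensional operator can be checked for the
two-dimensional Fourier case. The sketch below assumes a unitary
(1/sqrt(N)) normalization and a random test array; it compares the
successive application of two one-dimensional operators against
NumPy's two-dimensional FFT. The helper name is illustrative.

```python
import numpy as np

def dft_matrix(n):
    """One-dimensional DFT operator with unitary normalization."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)

rng = np.random.default_rng(0)
image = rng.random((8, 16))      # stand-in for I(x, y)

a1 = dft_matrix(8)               # acts on the first coordinate
a2 = dft_matrix(16)              # acts on the second coordinate

# Separable form: sum over x and y of A1(u1, x) A2(u2, y) I(x, y).
separable = a1 @ image @ a2.T
reference = np.fft.fft2(image, norm="ortho")
max_difference = np.max(np.abs(separable - reference))
```

The same pattern extends to three or four coordinates by applying one
operator per axis.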

Specification of $A_i$, i = 1, 2, 3, 4 defines the transform and
the numerical implementation. The following well-known
representation exists for the discrete Fourier transform (Andrews, 1970)

$$A(u,x) = \frac{1}{\sqrt{N}} \exp\left(-\frac{2\pi j}{N}\,ux\right) \tag{2.3-5}$$

Similarly for the Walsh transform

$$A(u,x) = \frac{1}{\sqrt{N}}\,(-1)^{\sum_{i=0}^{n-1} u_i x_i} \tag{2.3-6}$$

N is the order of matrix A(u,x). It is arbitrary for the
Fourier transform but restricted for the Walsh transform to values
$2^n$, where n is a positive integer. The variables $x_i$, $u_i$ in
Equation (2.3-6) are the bits of the binary representations of x and u,
respectively.
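
A kernel of the Walsh type can be generated directly from the binary
digits of u and x. The sketch below assumes the form
$(1/\sqrt{N})(-1)^{\sum_i u_i x_i}$ with matching bit positions, which
yields the natural (Hadamard) ordering; the function name is
illustrative.

```python
import numpy as np

def walsh_kernel(n_bits):
    """Matrix A(u, x) = (-1)**(sum_i u_i x_i) / sqrt(N), with N = 2**n_bits."""
    size = 1 << n_bits
    u = np.arange(size)[:, None]
    x = np.arange(size)[None, :]
    common = u & x                      # bit i is set when u_i = x_i = 1
    parity = np.zeros_like(common)
    for i in range(n_bits):
        parity ^= (common >> i) & 1     # parity of sum_i u_i x_i
    return (1 - 2 * parity) / np.sqrt(size)

a = walsh_kernel(3)
# The kernel is orthonormal, so the inverse transform is its transpose.
identity_error = np.max(np.abs(a @ a.T - np.eye(8)))
```

Every entry has magnitude 1/sqrt(N), so apart from the overall scale
the transform needs only additions and subtractions.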

2.4  Error  Criteria 

Between the source and the destination, the image is subjected
to significant processing. It is important to note again that the
communication link of Figure 1.1-1 is digital and the source, the visual
scene, is analog. It is highly desirable to quantify the image
degradation due to the coding algorithm. Let I be the input to the coder
and $\hat{I}$ its estimate at the destination. A measure of error, E, may
be schematically specified as a functional dependence G on the
difference between I and $\hat{I}$,

$$E = G(I - \hat{I}) \tag{2.4-1}$$

with the constraint that G(0) = 0.

Although a practical implementation of Equation (2.4-1) would be
extremely useful, it is still an unsolved problem.

Determination of a useful error measure for image evaluation
is extremely difficult because even the most approximate
mathematical modeling of human vision is available only in limited
cases. A conventional compromise to Equation (2.4-1) is the
mean-square error between I and $\hat{I}$, which can be written in terms
of the previously-developed notation as

$$E = \sum_x \sum_y \sum_\lambda \sum_t \left| I(x,y,\lambda,t) - \hat{I}(x,y,\lambda,t) \right|^2 \tag{2.4-3}$$

The image energy, $I_E$, is obtained from the above two equations by
letting $\hat{I} = 0$. Consequently, the normalized mean square error as
used in Chapters 4 through 6 is given by $100 \times E / I_E$ in
percentages.
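
The normalized error measure can be stated compactly in code. The
following sketch assumes the image and its estimate are available as
arrays of equal shape; the function name and the small test arrays are
illustrative.

```python
import numpy as np

def normalized_mse_percent(original, decoded):
    """100 * (squared error energy) / (image energy), in percent."""
    original = np.asarray(original, dtype=float)
    decoded = np.asarray(decoded, dtype=float)
    error_energy = np.sum(np.abs(original - decoded) ** 2)
    # Image energy: the same sum with the estimate set to zero.
    image_energy = np.sum(np.abs(original) ** 2)
    return 100.0 * error_energy / image_energy

image = np.array([[3.0, 4.0], [0.0, 0.0]])
estimate = np.array([[3.0, 3.0], [0.0, 0.0]])
nmse = normalized_mse_percent(image, estimate)   # error 1, energy 25
```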

2.5 Non-Negative Bound (Lukosz)

The Fourier transform of a non-negative signal obeys various
well-known constraints. Perhaps the most important is the amplitude
constraint. Let

$$G(u) = \int_{-\infty}^{\infty} g(x)\, e^{-2\pi jux}\, dx \tag{2.5-1}$$

if

$$g(x) \ge 0$$

then

$$|G(u)| \le G(0) \tag{2.5-2}$$

The inequality (2.5-2) is well known (Goodman, 1968). The
very important extension of this inequality to band-limited
non-negative signals has unfortunately been relegated to obscurity. A
properly sampled image does represent a non-negative band-limited
signal and as such obeys the inequality discovered by W. Lukosz
(Lukosz, 1962), which is designated in this dissertation as the Lukosz
bound.

In his original paper Lukosz was concerned with the modulation
transfer function properties of optical systems as related to
incoherent imaging. The Fourier transform of the optical transfer
function (the point source image) of an optical system is non-negative,
and the transfer function itself has an absolute cutoff frequency.
Given this information, Lukosz intended to determine if any additional
constraints are applicable beyond Equation (2.5-2). Structurally, the
incoherent optical transfer function and the Fourier transform of a
band-limited non-negative image are equivalent; that is to say that by
definition they satisfy the same requirements. Consequently, the
mathematical derivation of the Lukosz bound is applicable to a
band-limited image as well as to the optical transfer function.

The Lukosz bound can be derived for any number of dimensions.
The bound becomes stronger with increasing numbers of dimensions.
The mathematical derivation of the one-dimensional bound will be
demonstrated in this section. For derivation of the two-dimensional
case, the reader is referred to the original Lukosz paper.

Consider the Fourier transform pair as in Equation (2.5-1),
with the additional constraint:

$$G(u) = 0, \quad \text{for } u \ge u_m \tag{2.5-3}$$

where $u_m$ is the cutoff frequency. Note also that Equation (2.5-2) is
already applicable.

Let h(x) be another non-negative function, not restricted to
be band-limited. Clearly, the convolution of h and g is also
non-negative.

$$h * g = \int_{-\infty}^{\infty} h(s)\, g(x-s)\, ds \ge 0 \tag{2.5-4}$$

Assume h to be Fourier transformable,

$$H(u) = \int_{-\infty}^{\infty} h(x)\, e^{-2\pi jux}\, dx$$

Furthermore, it is easy to show that h * g satisfies Equation
(2.5-4) by utilizing the Fourier transform properties of the
convolution integral. The previous statements become even more obvious
in the framework of linear system theory, as Lukosz argued, where g
represents a low-pass filter function and h is the input signal.
However, the specific physical argument, while intuitively satisfying,
is unnecessary to the mathematical derivation.

Let h(x) be the Dirac comb function, comb(x/L), as defined
previously in Equation (2.1-1). The comb(x/L) is a periodic function,
where the period is L. Therefore, a Fourier series representation
of comb(x/L) exists, and it is (see also Figure 2.5-1 for the
graphical demonstration):

$$\mathrm{comb}\,\frac{x}{L} = 1 + 2\sum_{n=1}^{\infty} \cos\frac{2\pi nx}{L} \tag{2.5-5}$$

Let

$$G(u) = |G(u)| \exp\{j\theta(u)\},$$

and let

$$1/L \ge u_m/2,$$

then the convolution integral will preserve only the n = 1 term in
Equation (2.5-5):

$$h * g = G(0) + 2\,|G(1/L)| \cos\left[2\pi x/L + \theta(1/L)\right] \tag{2.5-6}$$

Clearly, the inequality (2.5-2) is not sufficient to prevent the
violation of inequality (2.5-4). The additional constraint must be
imposed that

$$|G(1/L)| \le \tfrac{1}{2}\,G(0) \quad \text{for } 1/L \ge u_m/2 \tag{2.5-7}$$

Equation (2.5-7) is, in fact, the Lukosz bound for the region
$u_m/2 \le u < u_m$. The derivation of other segments is based on
choosing appropriate forms for h(x). Specifically, let h(x) have the
following form

$$h(x) = \tfrac{1}{2}\left[\mathrm{comb}\left(\frac{x}{L} - \frac{1}{8}\right) + \mathrm{comb}\left(\frac{x}{L} + \frac{1}{8}\right)\right]$$

This h(x) has the following Fourier series representation
(see again Figure 2.5-2 for graphical demonstration).

$$h(x) = 1 + 2\sum_{n=1}^{\infty} \cos\frac{\pi n}{4}\,\cos\frac{2\pi nx}{L} = 1 + \sqrt{2}\,\cos\frac{2\pi x}{L} + 2\sum_{n=3}^{\infty} \cos\frac{\pi n}{4}\,\cos\frac{2\pi nx}{L} \tag{2.5-8}$$


For $1/L \ge u_m/3$, the form of h * g is

$$h * g = G(0) + \sqrt{2}\,|G(1/L)| \cos\left[2\pi x/L + \theta(1/L)\right] \tag{2.5-9}$$

Since h * g must not be negative,

$$|G(1/L)| \le \frac{1}{\sqrt{2}}\,G(0) \quad \text{for } 1/L \ge u_m/3 \tag{2.5-10}$$

Inequality (2.5-10) provides the next section of the Lukosz bound,
namely, $u_m/3 \le u < u_m/2$. It is equally valid for
$u_m/2 \le u < u_m$, but it is weaker than (2.5-7) and, therefore, not
useful for that region.

The general form of the non-negative bound is obtained by
choosing more complicated forms for h(x). The general inequality
is the following

$$|G(u)| \le G(0)\,\cos\frac{\pi}{n+1} \quad \text{for } \frac{u_m}{n} \le u < \frac{u_m}{n-1},\quad n = 2, 3, \ldots \tag{2.5-11}$$

and it is demonstrated in Figure 2.5-3. The argument u in
inequality (2.5-11) is equivalent to 1/L in inequalities (2.5-7) and
(2.5-10).
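
The staircase shape of the one-dimensional bound can be tabulated
numerically. The sketch below assumes the segment rule
$|G(u)| \le G(0)\cos[\pi/(n+1)]$ for $u_m/n \le u < u_m/(n-1)$, which is
consistent with the two segments derived above (1/2 and 1/sqrt(2)).
The function name is illustrative and the result is normalized by G(0).
As a test function the sketch uses the triangle $1 - |u|/u_m$, the
transfer function of a uniformly lit slit.

```python
import numpy as np

def lukosz_bound(u, u_m):
    """Upper limit on |G(u)| / G(0) for a non-negative signal with cutoff u_m."""
    u = np.abs(np.atleast_1d(np.asarray(u, dtype=float)))
    bound = np.ones_like(u)              # at u = 0 only |G| <= G(0) applies
    inside = (u > 0) & (u < u_m)
    n = np.ceil(u_m / u[inside])         # smallest n with u_m / n <= u
    bound[inside] = np.cos(np.pi / (n + 1))
    bound[u >= u_m] = 0.0                # band limit
    return bound

u_m = 1.0
u = np.linspace(0.0, u_m, 1001)
triangle = 1.0 - u / u_m
bound = lukosz_bound(u, u_m)
margin = np.min(bound - triangle)        # never significantly below zero
```

For the triangle the margin closes only at $u = u_m/2$, where both the
function and the bound equal one half.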

Inequality (2.5-11) is the Lukosz bound for one-dimensional,
non-negative, band-limited signals. Its extension to higher
dimensions can easily be obtained by successive Fourier decomposition
of the various dimensions. As previously stated, only the results
will be given here.

Let g be non-negative, have a two-dimensional Fourier
transform, G, and have band limits $u_m$, $v_m$.

$$G(u,v) = \iint_{-\infty}^{\infty} g(x,y)\, e^{-2\pi j(xu + yv)}\, dx\, dy \tag{2.5-12}$$

$$G(u,v) = 0, \quad \text{for } u \ge u_m \text{ or } v \ge v_m \tag{2.5-13}$$

The functional form of the inequality for G is (Figure 2.5-4):

$$\tfrac{1}{2}\{|G(u,v)| + |G(-u,v)|\} = \tfrac{1}{2}\{|G(u,v)| + |G(u,-v)|\} \le G(0,0)\,\cos\frac{\pi}{n+1}\,\cos\frac{\pi}{m+1} \tag{2.5-14}$$

for

$$u_m/n \le u \le u_m/(n-1)$$

and

$$v_m/m \le v \le v_m/(m-1)$$

The actual derivation (Lukosz, 1962) is straightforward
although somewhat involved. By letting $G(u,v) = G_u(u)\,G_v(v)$ and
applying the one-dimensional bound to $G_u$ and $G_v$ individually, the
validity of Equation (2.5-14) was demonstrated by Lukosz. Note that
if G has directional symmetry, then

$$\tfrac{1}{2}\{|G(u,v)| + |G(-u,v)|\} = \tfrac{1}{2}\{|G(u,v)| + |G(u,-v)|\} = |G(u,v)| \tag{2.5-15}$$

Equation (2.5-15) can easily be proven by the well-known property of
the Fourier transform of real functions in which G obeys:

$$G(u,v) = G^*(-u,-v) \tag{2.5-16}$$

It easily follows that

$$|G(u,v)| = |G(-u,-v)| \tag{2.5-17}$$

and

$$|G(-u,v)| = |G(u,-v)| \tag{2.5-18}$$

It can easily be shown via Equation (2.5-18) that if |G| is symmetric
around the u axis, it has symmetry around the other axis as well.

Before proceeding to the derivation of additional constraints
based on inequality (2.5-14), a few general comments on the
importance of this inequality are in order.

The Lukosz bound restricts the amplitude range in the Fourier
domain; it does not, however, constrain the values the phase may
assume. One can qualitatively argue that in some sense the phase
carries more "information" about the non-negative sampled image
than the amplitude. This statement, which will later be considered
in a more formal presentation, is quite significant for the various
areas of image processing, including holography, where, in fact,
the superiority of phase information has been observed
experimentally (Kermisch, 1970).

Actually the inequalities (2.5-11) and (2.5-14) can further be
strengthened. The average values of Figures 2.5-3 and 2.5-4 are
clearly larger than 1/2 and 1/4, respectively. It can, however, be
shown, and again the reader is referred to the original paper for the
derivation, that 1/2 G(0) and 1/4 G(0,0) are the appropriate limits
for the one- and two-dimensional cases, respectively. For the
one-dimensional case

$$\frac{1}{u_m}\int_0^{u_m} |G(u)|\, du \le \tfrac{1}{2}\,G(0) \tag{2.5-19}$$

and for the two-dimensional case

$$\frac{1}{u_m v_m}\int_0^{u_m}\!\!\int_0^{v_m} |G(u,v)|\, du\, dv \le \tfrac{1}{4}\,G(0,0) \tag{2.5-20}$$


The implication of Equations (2.5-19) and (2.5-20) is that for
no image can G actually assume the upper bound everywhere in the
Fourier domain. The functions which, in fact, satisfy Equations
(2.5-19) and (2.5-20) with equality are, for the one-dimensional case:

$$|G(u)| = G(0)\left[1 - \frac{|u|}{u_m}\right], \quad |u| \le u_m \tag{2.5-21}$$

For the two-dimensional case:

$$|G(u,v)| = G(0,0)\left[1 - \frac{|u|}{u_m}\right]\left[1 - \frac{|v|}{v_m}\right], \quad |u| \le u_m,\ |v| \le v_m \tag{2.5-22}$$

It is interesting to note that G reaches the bound at a single point:

$$G(u_m/2) = \tfrac{1}{2}\,G(0) \quad \text{and} \quad G(u_m/2,\, v_m/2) = \tfrac{1}{4}\,G(0,0).$$

Except for this point, G as defined in Equations (2.5-21) and (2.5-22)
lies below the appropriate non-negative limit. The two special
functions, (2.5-21) and (2.5-22), represent for the optical case the
modulation transfer function for the uniformly lit slit and rectangular
aperture, respectively.

The various inequalities (2.5-7), (2.5-14), (2.5-19), and
(2.5-20) allow an information theoretic interpretation of the
Fourier domain for non-negative signals.

The entropy associated with an image is invariant under the
Fourier transform as well as any other transform for which the
Jacobian of transformation is unity. If no a priori information is
available, the image entropy is uniformly distributed in the frequency
domain by assumption. This type of reasoning yields upper bounds
on the entropy rather than entropy estimates for actual images
whose correlation properties are known. Given, thus, that the
image entropy is divided between amplitude and phase, it is
important to learn what effects the constraints (2.5-7), (2.5-14),
(2.5-19), and (2.5-20) will have on the entropy division. Assumption
of no a priori information implies, on the basis of Equation
(2.5-2) alone, that the Fourier domain represents a uniform entropy
density for spatial frequencies below the band limit. Restriction of
the allowed amplitude range will proportionally limit the entropy.

The ratio of the entropies with and without the Lukosz bound is 1/2
and 1/4 for the one-dimensional and two-dimensional cases,
respectively. This statement follows from the inequalities (2.5-19) and
(2.5-20). One can argue that, for band-limited, non-negative
images, the entropy associated with the phase is larger by a factor
of 2 and 4 for the one- and two-dimensional cases, respectively.

The  optical  analog  is  the  case  of  incoherent  imaging,  for  which 
it  can  be  argued,  as  Lukosz  did,  that  the  optical  system  by  virtue  of 
its  low-pass  filtering  will  limit  the  information  transfer  by  1/2  and 
1/4  for  one  and  two  dimensions,  respectively. 

The Lukosz bound is a significant contribution to the science of
the signal processing of non-negative band-limited signals. The
implication of the importance of phase over amplitude in digital
image processing is useful information and has strongly motivated
the research in this dissertation.


3.  IMAGE  SOURCE  CODING 


The  transmission  of  data  consists  of  two  distinct  coding  steps: 
source  coding  and  channel  coding.  Schematic  representation  of  the 
classical  communication  problem  was  reviewed  in  Chapter  1.  This 
dissertation  treats  the  image  coding  problem  as  one  which  fits  into 
the  domain  of  source  coding.  This  approach  permits  structural 
separation  of  image  coding  from  the  consideration  of  channel  errors. 

In this chapter various aspects of image coding are considered.
The basic theme of the dissertation is that the phase (yet to be
explicitly defined) is the primary parameter whose fidelity should
be maintained in the coding process. The various steps that
constitute the coding process are considered in the context of phase
coding. The primary transform domain is that of the Fourier;
however, extension is made to the Walsh domain as well. In fact,
successful utilization of phase in other than the Fourier domain is
a discovery which, prior to this dissertation, has not appeared in
the literature as far as the author is aware.

3.1 Statistics of the Fourier Transform

The various coding schemes of Chapters 4 through 6 utilize
the properties of the transform domain. The primary transform is
the Fourier which has extremely advantageous properties from the
coding standpoint. The close similarity between the Fourier and
Walsh decompositions makes the latter transform also useful. The
statistical properties of the Fourier transform domain are explored
in this section; the extension to the Walsh domain is the topic of the
next section.

Let $I(\mathbf{x})$ and $\mathcal{I}(\mathbf{u})$ be a Fourier transform
pair. To simplify notation, the image coordinates are condensed into
vector form. Vectors $\mathbf{u}$, $\mathbf{x}$ have a number of
components equal to the dimension of the coding problem. The monochrome
problem has two dimensions; for this case,

$$\mathbf{u} = \{u, v\}, \qquad \mathbf{x} = \{x, y\}$$

The frame-to-frame, or color coding problem is of three
dimensions; for this case,

$$\mathbf{u} = \{u, v, w\}, \qquad \mathbf{x} = \{x, y, t\}$$

The vector notation permits statistical analysis of the Fourier
transform of an image without specification of the dimension.

In  addition,  the  infinite  extent  of  the  image  plane  implies  that 
the  Fourier  domain  is  uncorrelated  in  the  limit  as  the  number  of 
samples  grows  to  infinity  (Davenport  and  Root,  1958,  Section  6-4). 

The functional form of the power spectral density is required
if quantization of the transform samples is to be accomplished
efficiently. All transform coding techniques require an estimate
of the power spectral density; their overall performance is largely
determined by how well the power spectral density estimation is
accomplished.

Information-theoretic discussion of the frequency plane, based
on the Lukosz bound, already implied a certain superiority of the
phase. Stochastic consideration of the Fourier domain allows
additional interpretation, in fact, a general definition of the phase.
This dissertation expands the phase concept to what will be referred to
as the unconventional definition.

a)  Conventional  Definition 

The complex valued function $\mathcal{I}$ is the sum of real and
imaginary components,

$$\mathcal{I}(\mathbf{u}) = I_R(\mathbf{u}) + j\,I_I(\mathbf{u})$$

the phase $\theta(\mathbf{u})$ associated with $\mathbf{u}$ is normally
defined as

$$\theta(\mathbf{u}) = \tan^{-1}\left[ I_I(\mathbf{u}) / I_R(\mathbf{u}) \right] \tag{3.1-1}$$

The definition in Equation (3.1-1) is required if the various
well-known phase-related deterministic properties of the Fourier
transform are to be utilized.

b)  Unconventional  Definition 

Under the assumption, based on experimental evidence,
that $I_R$ and $I_I$ are approximately Gaussian, $\theta$ is uniformly
distributed and uncorrelated for different values of $\mathbf{u}$, that is

$$E\{\theta(\mathbf{u}_1)\,\theta(\mathbf{u}_2)\} = \frac{\pi^2}{3}\,\delta(\mathbf{u}_1 - \mathbf{u}_2) \tag{3.1-2}$$

and

$$E\{|I_I(\mathbf{u})|^2\} = E\{|I_R(\mathbf{u})|^2\} = \tfrac{1}{2}\,S(\mathbf{u}) \tag{3.1-3}$$

In most practical situations $S(\mathbf{u})$ is a smooth surface, which
means that $S(\mathbf{u}_1) \approx S(\mathbf{u}_2)$ for
$|\mathbf{u}_1 - \mathbf{u}_2| < M$. The expression
$|\mathbf{u}_1 - \mathbf{u}_2|$ is the Euclidean distance for vectors
$\mathbf{u}_1$ and $\mathbf{u}_2$. For the sampled case, a
reasonable value for M might be at least 5 (in harmonics). The
comment should be interjected that $\mathcal{I}$ can be only
approximately Gaussian since its components are restricted in range by
the D.C. term and, for the band-limited case, by the additional Lukosz
bound.

Based on the smoothness of S, the following stochastic
unconventional phase definitions can be made, with the previously-made
restriction $|\mathbf{u}_1 - \mathbf{u}_2| < M$.

$$\theta(\mathbf{u}_1, \mathbf{u}_2) = \tan^{-1}\{ I_K(\mathbf{u}_1) / I_L(\mathbf{u}_2) \} \tag{3.1-4}$$

Subscripts K and L represent independent subscript assignments
from I and R (imaginary and real) if
$\mathbf{u}_1 \ne \mathbf{u}_2$; K and L represent different subscripts
if $\mathbf{u}_1 = \mathbf{u}_2$. The following forms for $\theta$ are
allowed under the unconventional definition

$$\theta(\mathbf{u}_1, \mathbf{u}_2) = \tan^{-1}\{ I_R(\mathbf{u}_1) / I_R(\mathbf{u}_2) \}, \quad \mathbf{u}_1 \ne \mathbf{u}_2$$

$$\theta(\mathbf{u}_1, \mathbf{u}_1) = \tan^{-1}\{ I_R(\mathbf{u}_1) / I_I(\mathbf{u}_1) \}$$

The following definition is not permitted

$$\theta(\mathbf{u}_1, \mathbf{u}_1) = \tan^{-1}\{ I_R(\mathbf{u}_1) / I_R(\mathbf{u}_1) \}$$

since, in this case, $\theta$ is a single value rather than a random
variable. The stochastic phase definition is important because it gives
validity to phase coding in domains other than the Fourier.
Experimental demonstration of the utility of the stochastic phase will
be given in this chapter.
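
The statistical argument behind the stochastic phase can be
illustrated by simulation. The sketch below assumes independent,
zero-mean Gaussian samples of equal variance and uses the four-quadrant
arctangent, so that the angle ranges over $(-\pi, \pi)$; these choices,
the sample count, and the variable names are assumptions made for the
demonstration. Under them, an angle formed from the real and imaginary
parts at one frequency and an angle formed from the real parts at two
neighboring frequencies both behave as uniformly distributed variables
with variance $\pi^2/3$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials = 200_000
sigma = 2.0   # common standard deviation, i.e. S(u1) ~ S(u2)

real_u1 = rng.normal(0.0, sigma, n_trials)
imag_u1 = rng.normal(0.0, sigma, n_trials)
real_u2 = rng.normal(0.0, sigma, n_trials)

# Conventional phase: imaginary over real part at the same frequency.
conventional = np.arctan2(imag_u1, real_u1)
# Unconventional phase: real parts taken at two different frequencies.
unconventional = np.arctan2(real_u1, real_u2)

uniform_variance = np.pi ** 2 / 3
variance_errors = (abs(conventional.var() - uniform_variance),
                   abs(unconventional.var() - uniform_variance))
```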

3.2  Extension  to  the  Walsh  Domain 

The Fourier transform of an image tends to be uncorrelated.
The existence of uncorrelated samples permitted definition of the
generalized phase. Although the Fourier transform is unique in
having the above-mentioned properties, other linear transformations
may approximate the Fourier transform in some sense. One specific
implementation will involve the Walsh functions.

Let $f_k$ be an element of an N component vector (that is,
k = 1, 2, ..., N). Two distinct transforms of $f_k$ are $a_i$ and $b_j$,
which are also elements of N component vectors; therefore

$$a_i = \sum_k G_{ik}\, f_k \tag{3.2-1}$$

$$b_i = \sum_k H_{ik}\, f_k \tag{3.2-2}$$

where matrices G and H are invertible matrices of order N. The
summation in Equations (3.2-1) and (3.2-2) is over N components;
the same convention will remain in force for the rest of this section.

Although Equations (3.2-1) and (3.2-2) can represent any
linear decomposition, the specific assignment will be made where
G will represent the Walsh and H the Fourier decomposition.

By straightforward manipulation, it can be shown that

$$f_k = \sum_i G^{-1}_{ki}\, a_i = \sum_j H^{-1}_{kj}\, b_j \tag{3.2-3}$$

and, therefore,

$$a_i = \sum_k \sum_j G_{ik}\, H^{-1}_{kj}\, b_j \tag{3.2-4}$$

The following definition is introduced for notational convenience

$$Z_{ij} = \sum_k G_{ik}\, H^{-1}_{kj} \tag{3.2-5}$$

Consequently, the transform values are related through the linear
relationship:

$$a_i = \sum_j Z_{ij}\, b_j \tag{3.2-6}$$

Let $\{g_i(x_k)\}$ and $\{h_i(x_k)\}$ be orthonormal basis vectors
generating the space in which $f_k$ is defined. Note that $g_i(x_k)$ is
the k-th element of the i-th vector. Obviously, both i and k have index
values 1 through N.

For the special case where $G_{ki} = g_i(x_k)$ and
$H_{ki} = h_i(x_k)$, it is easy to demonstrate that
$G^{-1}_{ik} = G^*_{ki}$ and $H^{-1}_{ik} = H^*_{ki}$.

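It is desirable to treat the $a_i$'s and $b_j$'s as zero mean real
random variables and consider transformation of the second order
statistics. Clearly,

$$E\{a_l\, a_n\} = \sum_j \sum_k Z_{lj}\, Z_{nk}\, E\{b_j\, b_k\} \tag{3.2-7}$$

If the Fourier designation is given to $H_{ik}$, then, according to the
results of the previous section,

$$E\{b_j\, b_k\} = \sigma_j^2\, \delta_{jk} \tag{3.2-8}$$

where $\sigma_j^2$ is the variance of $b_j$ and

$$\delta_{jk} = 1, \quad j = k; \qquad \delta_{jk} = 0, \quad j \ne k$$

Consequently,

$$E\{a_l\, a_n\} = \sum_j Z_{lj}\, Z_{nj}\, \sigma_j^2 \tag{3.2-9}$$

$$E\{a_l^2\} = \sum_j Z_{lj}^2\, \sigma_j^2 \tag{3.2-10}$$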

The previous section indicated that Fourier transform samples
have a Gaussian distribution. By Equation (3.2-6), it is observed
that the $a_i$'s also tend toward a Gaussian stochastic process. If the
choice for $G_{ik}$ is such that
$E\{a_l\, a_n\} = \delta_{ln}\, E\{a_l^2\}$, that is, the $a_i$'s are
also uncorrelated, one can define amplitude and phase on pairs of
random variables, say $a_l$, $a_n$. If
$E\{a_l^2\} \approx E\{a_n^2\}$, the functional form of the appropriate
probability density functions for the amplitude and phase should be the
same as the ones defined for the Fourier transform.

The specification of G for the Walsh decomposition can be
written in terms of the appropriate orthonormal basis vectors.
Utilizing the conventional notation (Harmuth, 1972)

$$G_{ki} = \mathrm{wal}(i, x_k) \tag{3.2-11}$$


Walsh functions can be generated through the following
difference equation.

$$\mathrm{wal}(2j+p,\, x) = (-1)^{[j/2]+p} \left\{ \mathrm{wal}\left[j,\, 2\left(x + \tfrac{1}{4}\right)\right] + (-1)^{j+p}\, \mathrm{wal}\left[j,\, 2\left(x - \tfrac{1}{4}\right)\right] \right\} \tag{3.2-12}$$

p = 0 or 1; j = 0, 1, 2, ...

wal(0, x) = 1 for -1/2 ≤ x ≤ 1/2
          = 0 otherwise

and [j/2] is the largest integer smaller than or equal to j/2.
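
A difference equation of this kind translates directly into a
recursive routine. The sketch below assumes the Harmuth form
wal(2j+p, x) = (-1)^([j/2]+p) {wal[j, 2(x+1/4)] + (-1)^(j+p) wal[j, 2(x-1/4)]},
samples the functions at the midpoints of N equal intervals on
(-1/2, 1/2), and treats the support of wal(0, x) as half-open so that
the two shifted copies never overlap. The grid size and names are
illustrative.

```python
import numpy as np

def wal(j, x):
    """Walsh function wal(j, x) evaluated by the recursive definition."""
    x = np.asarray(x, dtype=float)
    if j == 0:
        return np.where((x >= -0.5) & (x < 0.5), 1.0, 0.0)
    half, p = divmod(j, 2)                    # j = 2 * half + p
    sign = (-1.0) ** (half // 2 + p)
    left = wal(half, 2.0 * (x + 0.25))        # copy on the left half
    right = wal(half, 2.0 * (x - 0.25))       # copy on the right half
    return sign * (left + (-1.0) ** (half + p) * right)

N = 16
x = (np.arange(N) + 0.5) / N - 0.5            # interval midpoints
W = np.array([wal(j, x) for j in range(N)])

# Row j should change sign j times (sequency ordering).
sign_changes = [int(np.sum(W[j, 1:] != W[j, :-1])) for j in range(N)]
```

The rows of W are mutually orthogonal, so W divided by sqrt(N) can
serve as the matrix G of Equation (3.2-11) on this grid.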

The Z matrix can be generated by decomposing each Walsh
function into a Fourier series. Walsh functions have a symmetry
similar to that of the sines and cosines. Denoting the even and odd
Walsh functions as cal and sal, respectively, it follows

$$\mathrm{wal}(2i, x) = \mathrm{cal}(i, x)$$

$$\mathrm{wal}(2i - 1, x) = \mathrm{sal}(i, x) \tag{3.2-13}$$

As previously indicated, real Fourier decomposition is where
the basis functions, the $h_i$'s, are sine and cosine functions;
similarly, the $g_i$'s are the cal and sal functions. Because of the
even-odd symmetry of both sets of basis vectors, even functions of one
set can be represented by only even functions of the other set. Similar
representation holds for odd basis vectors. The same symmetry results
in the following restriction for the Z matrix.

$$Z_{ik} = 0 \quad \text{for } |i - k| = \text{odd integer} \tag{3.2-14}$$

The specific result of Equation (3.2-14) is that elements in
the Walsh domain which are separated by an odd number of elements will
be uncorrelated. The choice of adjacent element pairs for amplitude
and phase specification is strongly motivated by the symmetry
consideration.

Although a simple functional form does not exist for the Z
matrix, numerical generation of the elements can easily be
performed for specific transform pairs. As an example, consider the
Walsh into Fourier decomposition for N = 1024 values. For a specific
choice of l, e.g., the l-th Walsh function, the l-th row of the Z
matrix is generated. The inverse of Z is similarly generated by the
decomposition of particular sine and cosine functions into Walsh
functions. Numerical examples are shown in Figures 3.2-1 through
3.2-8. These figures indicate the recognized similarity between the
Walsh and trigonometric functions. It is interesting to observe that
diagonal elements of Z dominate each row.

For completeness, the "fast" computability of the Walsh and
Fourier transforms should be pointed out. The straightforward
application of Equations (3.2-1) and (3.2-2) requires $N^2$ operations
(an operation being one complex multiplication for Fourier and one
addition or one subtraction for Walsh). The particular form of G and H
permits a much more rapid implementation of these transforms
where the number of operations is reduced to N log N (Andrews,
1970; Harmuth, 1972; Cooley and Tukey, 1965).
Figure 3.2-5. Sixty-Third Walsh Function

Figure 3.2-6. Fourier Decomposition of Sixty-Third Walsh Function
The "fast" algorithms are important for efficient coding
implementation. Particularly for large data blocks, the efficiency
factor N/log N can be significant. The "fast" algorithm is available
for the Z matrix as well and it was utilized for the generation of
Figures 3.2-1 through 3.2-8.
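The reduction from $N^2$ to N log N operations can be illustrated with a butterfly implementation. This is a sketch only: it computes the Walsh-Hadamard transform in natural (Hadamard) order rather than the sequency order used in this chapter, which permutes the outputs but does not change the operation count. The function names and the test vector are illustrative.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order.

    Returns the transform and the number of additions/subtractions used.
    """
    a = list(values)
    n = len(a)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    ops = 0
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
                ops += 2
        h *= 2
    return a, ops


def direct_wht(values):
    """Direct N^2 evaluation; Hadamard entry (k, m) is (-1)^popcount(k & m)."""
    n = len(values)
    out = [sum(v * (-1) ** bin(k & m).count("1") for m, v in enumerate(values))
           for k in range(n)]
    return out, n * n


data = [1, 0, 1, 0, 0, 1, 1, 0]
fast, fast_ops = fwht(data)
slow, slow_ops = direct_wht(data)
print(fast == slow, fast_ops, slow_ops)
```

For N = 8 the butterfly version uses N log2 N = 24 additions and subtractions against 64 terms in the direct sum; the gap widens as N/log N for larger blocks.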

3.3  Quantization 

The  continuous  image  parameters  must  be  expressed  in 
discrete,  that  is  to  say  quantized,  form  before  numerical  operation 
on  them  can  be  performed.  Formally,  quantization  is  equivalent  to 
a  noninvertible  mapping  of  the  real  numbers  onto  a  finite  set  of 
integers.  It  is  also  equivalent  to  a  one-to-one  mapping  of  finite  or 
semiinfinite  sections  of  the  real  axis  to  a  finite  set  of  integer 
numbers. 

According to the last definition, the real axis is partitioned into
sections and each section is assigned an integer designation. All
members of a section are assigned the same integer. Conversely, given
a particular integer assignment, no unique determination of the
original real value can be made.

It is obviously imperative to optimize the appropriate
quantization procedures. This step involves the selection of the
optimum quantization rules, based on the statistical model of the
parameter to be quantized.

The discretization of a continuous parameter always results
in a permanent, hopefully negligible, distortion. This distortion may
appear as an effective noise term or an actual structural distortion.
For the first case, the number of quantization levels is large and
the appropriate effects can be modeled by additive white noise. The
second case occurs for coarse quantization, for which the nonlinear
aspect of quantization dominates.

The  following  basic  model  will  be  considered.  Let  x  be  a 
continuous  random  variable  with  a  probability  density  function  P(x). 
The  functional  form  of  the  quantization  can  be  expressed  in  terms  of 
the  previously-introduced  rect  function  as 


$$Q(x) = \sum_{j=1}^{N} \hat{x}_j \,\mathrm{rect}\left[\frac{x - \tfrac12 (x_j + x_{j-1})}{x_j - x_{j-1}}\right] \tag{3.3-1}$$


In Equation (3.3-1) there are N integer assignments. To each
integer another real value, $\hat{x}_j$, is assigned. The $\hat{x}_j$
is the reconstruction value or the estimate of x. The specification of
the parameters in Equation (3.3-1) should be such that $\hat{x}_j$
closely "approximates" x. If the mean-squared error (MSE) is the
performance measure, then

$$\text{Error} = \min \left\{ \int P(x)\,\bigl(Q(x) - x\bigr)^2\, dx \right\} \tag{3.3-2}$$

where minimization is performed over all $x_j$'s and $\hat{x}_j$'s for
a given N. The solution of Equation (3.3-2) is well known (Max, 1960);
with the j-th decision interval written as $[x_j, x_{j+1}]$, it is

$$x_j = \tfrac12\left(\hat{x}_j + \hat{x}_{j-1}\right), \quad j = 2, \ldots, N \tag{3.3-3}$$

and

$$\int_{x_j}^{x_{j+1}} (x - \hat{x}_j)\, P(x)\, dx = 0, \quad j = 1, 2, \ldots, N \tag{3.3-4}$$

Equations (3.3-3) and (3.3-4) can be solved by iterative
techniques for given density functions. A note of caution should be
interjected. Equations (3.3-3) and (3.3-4) are formal solutions given
the P(x). In image coding, the relevant parameters are themselves
estimated. Utilization of an erroneous model may result in a poor
quantization procedure even though the solutions in Equations (3.3-3)
and (3.3-4) are faithfully followed.
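One common iterative technique alternates between the two conditions until the levels stop moving. The sketch below is an illustration under stated assumptions, not the procedure used in this study: the density is a unit Gaussian, the integrals are evaluated with a midpoint rule on a truncated range, and the starting levels, grid size, and iteration count are arbitrary choices.

```python
import math


def gaussian_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def lloyd_max(pdf, n_levels, lo=-8.0, hi=8.0, grid=4000, iterations=200):
    """Alternate between conditions (3.3-3) and (3.3-4) on a midpoint grid."""
    dx = (hi - lo) / grid
    xs = [lo + (i + 0.5) * dx for i in range(grid)]
    ws = [pdf(x) * dx for x in xs]          # probability mass per grid cell
    # Assumed starting guess: levels spread evenly over the central range.
    levels = [-2.0 + 4.0 * (j + 0.5) / n_levels for j in range(n_levels)]
    for _ in range(iterations):
        # (3.3-3): thresholds lie halfway between neighbouring levels
        thresholds = [0.5 * (a + b) for a, b in zip(levels, levels[1:])]
        # (3.3-4): each level becomes the centroid of its decision interval
        mass = [0.0] * n_levels
        moment = [0.0] * n_levels
        j = 0
        for x, w in zip(xs, ws):
            while j < n_levels - 1 and x >= thresholds[j]:
                j += 1
            mass[j] += w
            moment[j] += w * x
        levels = [m1 / m0 if m0 > 0 else lv
                  for m0, m1, lv in zip(mass, moment, levels)]
    thresholds = [0.5 * (a + b) for a, b in zip(levels, levels[1:])]
    return thresholds, levels


for n in (2, 4):
    t, l = lloyd_max(gaussian_pdf, n)
    print(n, [round(v, 3) for v in t], [round(v, 3) for v in l])
```

The note of caution above applies to the sketch as well: the result is only as good as the density handed to `lloyd_max`.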

If P(x) is uniform over a finite region, say $[x_0, x_N]$,
Equation (3.3-1) becomes the uniform quantizer.


(3.3-5) 


Another often-used quantization strategy, known as companding
(Smith, 1957), involves a two-stage process. First, x is mapped
into y, y = f(x), which is a random variable uniformly distributed
on [0, 1]. The random variable y is then operated on by the uniform
quantizer. The reconstruction levels of x and y are related by
the inverse mapping, $f^{-1}$ ($\hat{x}_j = f^{-1}(\hat{y}_j)$). The
mapping is the distribution function of x:

$$y = f(x) = \int_{-\infty}^{x} P(u)\, du \tag{3.3-6}$$


The reconstruction levels $\hat{x}_j$ and $\hat{y}_j$ are uniquely
related by the one-to-one mapping, f. By construction, each
$\hat{y}_j$ occurs with equal probability; thus, this case corresponds
to the maximum entropy the quantized values may have. This latter type
of quantizer procedure is suboptimal when MSE is the performance
criterion; however, for numerous density functions, optimum
performance is closely approached.

Quantization schemes can be closely approximated by simplified
procedures for fine quantization (Panter and Dite, 1951). The coding
schemes of Chapters 4 through 6 involve coarse quantization in the
transform domain; thus, these procedures are not relevant and will
not be further explored.

3.4  Amplitude  vs  Phase  Quantization  Effects 

The underlying theme of this dissertation is the superiority
of phase information. It is particularly relevant to consider
distortions introduced by the quantizing process. In this subsection,
the generalized phase and amplitude will be considered. The assumption
is made that application of the image transform (Fourier or Walsh)
results in uncorrelated samples. Amplitude and phase are defined
over pairs of values as in subsection 3.3 under the unconventional
definition.
Let $\theta$ and r be a phase and amplitude pair where $\theta$ is
uniformly distributed in $[-\pi, \pi]$ and r has a Rayleigh
distribution (Thomas, 1968, Chapter 4). The following procedure will
be implemented. Amplitude and phase will be independently quantized,
one at a time, and the appropriate MSE generated will be compared.

a) Phase Quantization

The uniform quantizer is optimum for the phase. The
actual error in the N level quantization process of a single phase
value in one of the N regions, say $[0, 2\pi/N]$, is

$$\text{Error} = A^2 \left| e^{i\theta} - e^{i\pi/N} \right|^2 \tag{3.4-1}$$

The $A^2$ is the energy associated with random variable r. Mean-
squared phase error (MSE) is obtained by averaging Equation (3.4-1)
over $\theta$ and all N quantization regions. Because of the symmetry
in $\theta$, each of the quantizing sections is statistically
equivalent.


therefore 
b) Amplitude Quantization

Maximum entropy quantization (companding) of subsection 3.6 will be
utilized (Habibi, 1973). The function f is required, which is the
appropriate distribution function:

$$f(r) = \int_0^r \frac{s}{\sigma^2}\, e^{-s^2/2\sigma^2}\, ds = 1 - e^{-r^2/2\sigma^2} \tag{3.4-5}$$

The inverse of f is also available in closed form:

$$f^{-1}(u) = \sigma \sqrt{-2 \log (1 - u)} \tag{3.4-6}$$
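The companding steps for the Rayleigh case can be sketched with the mapping f and its inverse. The sketch assumes that each reconstruction level is the inverse image of the mid-point of its y cell and that the MSE integral is truncated at ten standard deviations; neither convention is stated in the text, so the printed numbers need not coincide with Table 3.4-1. All function names are illustrative.

```python
import math


def rayleigh_cdf(r, sigma=1.0):
    """Distribution function f(r) of Equation (3.4-5)."""
    return 1.0 - math.exp(-r * r / (2.0 * sigma * sigma))


def rayleigh_inv_cdf(u, sigma=1.0):
    """Inverse mapping of Equation (3.4-6)."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))


def compand_quantize(r, n_levels, sigma=1.0):
    """Map r to y = f(r), quantize y uniformly on [0, 1], map the level back."""
    y = rayleigh_cdf(r, sigma)
    index = min(int(y * n_levels), n_levels - 1)
    y_hat = (index + 0.5) / n_levels     # assumed: mid-point of the y cell
    return index, rayleigh_inv_cdf(y_hat, sigma)


def companded_mse(n_levels, sigma=1.0, grid=20000, r_max=10.0):
    """Midpoint-rule estimate of E[(r - r_hat)^2] for the Rayleigh density."""
    dr = r_max / grid
    total = 0.0
    for i in range(grid):
        r = (i + 0.5) * dr
        pdf = r / sigma ** 2 * math.exp(-r * r / (2.0 * sigma * sigma))
        _, r_hat = compand_quantize(r, n_levels, sigma)
        total += (r - r_hat) ** 2 * pdf * dr
    return total


for n in (1, 2, 4, 8):
    print(n, round(companded_mse(n), 4))
```

Because every y cell has width 1/N, each integer index is produced with equal probability, which is the maximum entropy property noted in subsection 3.3.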

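Let $\sigma = 1$; the RMSE for the Rayleigh process using the
formalism of subsection 3.3 is

$$\text{RMSE} = \sum_{i=1}^{N} \int_{r_{i-1}}^{r_i} (r - \hat{r}_i)^2\, r\, e^{-r^2/2}\, dr \tag{3.4-7}$$

Note that the energy for the normalized ($\sigma = 1$) Rayleigh
process is 2,

$$\int_0^\infty r^2 \cdot r\, e^{-r^2/2}\, dr = 2 \tag{3.4-8}$$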
Evaluation of Equation (3.4-7) requires numerical techniques.
The appropriate numerical integration utilized a Hermitian
sixth order formula. Each region, $[r_{i-1}, r_i]$, was evaluated at
100 equidistant values. In addition to the integrand (denoted by R),
the particular numerical integration requires evaluation of the first
and second derivatives as well.

$$R(r) = (r - \hat{r}_i)^2\, r\, e^{-r^2/2} \tag{3.4-9}$$

$$R'(r) = \left[-r^4 + 2 r^3 \hat{r}_i + (3 - \hat{r}_i^2)\, r^2 - 4 r \hat{r}_i + \hat{r}_i^2\right] e^{-r^2/2} \tag{3.4-10}$$

$$R''(r) = \left[r^5 - 2 \hat{r}_i r^4 - (7 - \hat{r}_i^2)\, r^3 + 10 r^2 \hat{r}_i + (6 - 3 \hat{r}_i^2)\, r - 4 \hat{r}_i\right] e^{-r^2/2} \tag{3.4-11}$$

Numerical integration is performed over each of the N
sections and summation then performed over the N sections. The
RMSE due to phase or amplitude quantization is shown in Table 3.4-1.
The relative importance of phase over amplitude is effectively
demonstrated by this table, particularly for coarse quantization.
Ignoring amplitude completely causes 21.5 percent error of the total
image energy. The single-level quantizer collapses the entire range
of the random variable into a single a priori known value.
Consequently, all randomness associated with that variable is removed;
thus the associated entropy is zero. Essentially, the same result is
obtained in holography (Kermish, 1970) utilizing a much more
complicated physical model. The phase requires 2 bits (N = 4) to
maintain the same amount of MSE that is achieved by zero bits for
amplitude. Similarly, 1 bit amplitude is "worth" 3 bits of phase.
Since the majority of transform values in the experimental chapters
requires a very low degree of quantization, the quantitative results
of Table 3.4-1 are highly relevant, and demonstrative of the phase
superiority.


TABLE 3.4-1

THE RMSE INTRODUCED BY PHASE AND AMPLITUDE QUANTIZATION

Number of          RMSE                   RMSE
Quantum Levels     Phase Quantization     Amplitude Quantization

 1                 2.0                    0.215
 2                 0.73                   0.042
 4                 0.20                   0.025
 8                 0.05                   0.011
16                 0.013                  0.0048
32                 0.0031                 0.0020
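The phase column of the table can be checked against a closed form. The sketch assumes that Equation (3.4-1) is averaged over $\theta$ uniform in one region $[0, 2\pi/N]$ with $A^2 = 1$, which gives $2\,(1 - \sin(\pi/N)/(\pi/N))$; this expression is derived here from that assumption and is not quoted from the text. The tabulated values are copied from Table 3.4-1.

```python
import math


def phase_quantization_mse(n_levels):
    """Mean of |e^{i theta} - e^{i pi/N}|^2, theta uniform on [0, 2 pi/N]."""
    half_width = math.pi / n_levels
    return 2.0 * (1.0 - math.sin(half_width) / half_width)


# Phase column of Table 3.4-1, keyed by the number of quantum levels.
table_phase_column = {1: 2.0, 2: 0.73, 4: 0.20, 8: 0.05, 16: 0.013, 32: 0.0031}

for n, tabulated in table_phase_column.items():
    print(n, round(phase_quantization_mse(n), 4), tabulated)
```

Under this assumption the computed values agree with the tabulated phase column to within a few percent, roughly the precision at which the table is printed.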
3.5 Non-Linear Effects of Phase Quantization

The general comment was made in subsection 3.5 regarding
the nonlinearity of the quantizing process which is quite significant
for the case of coarse quantization. The appropriate effects are
structural and for them, the MSE may not be a descriptive parameter.

The importance of phase information has been emphasized.
Also, the achievement of a high degree of redundancy reduction
requires that most transform domain samples be quantized at few
quantum levels. Therefore, it is of value to demonstrate the type
of global distortion that results from quantizer nonlinearity.
Specifically, coarse phase quantization will be considered.

The  effect  of  phase  quantization  has  been  previously  considered 
in  relation  to  holography  (Goodman  and  Silvesteri,  1970;  and  Dallas, 
1971,  a  and  b).  Their  analysis  is  applicable  to  image  coding,  with 
some  important  modifications.  The  primary  difference  is  that  unlike 
a  digital  image  display,  in  holography,  the  final  image  inherently  is 
an  energy  representation.  Consequently,  extraneous  images  and 
ghosts  diminish  quadratically  with  the  number  of  quantum  levels  for 
holography.  A  similar  dependence  is  linear  for  image  coding,  thus 
the  distortion  is  more  emphasized. 

In the following, the conventional phase definition will be utilized
for the two-dimensional case. Let g and G be a Fourier transform
pair:

$$\iint_{-\infty}^{\infty} g(x, y)\, e^{-2\pi j (xu + yv)}\, dx\, dy = G(u, v) = |G(u, v)| \exp j\theta(u, v) \tag{3.5-1}$$


The phase, $\theta$, is linearly quantized to N levels and the inverse
Fourier transform is performed. The result is denoted by
$\hat{g}(x, y)$ and it is of the following form (Dallas, 1971, a):

$$\hat{g}(x, y) = \sum_{m=-\infty}^{\infty} \mathrm{sinc}\left(m + \tfrac{1}{N}\right) g_m(x, y) \tag{3.5-2}$$

The sinc function can be expanded as

$$\mathrm{sinc}(m + 1/N) = \mathrm{sinc}(1/N)\, (-1)^m / (mN + 1)$$

and $g_m$ is defined as

$$g_m(x, y) = \iint_{-\infty}^{\infty} |G(u, v)| \exp\bigl[j(mN + 1)\theta(u, v)\bigr] \exp\bigl[2\pi j(ux + vy)\bigr]\, du\, dv \tag{3.5-3}$$

Note that for m = 0, $g_m$ is the original image. For $m \neq 0$,
$g_m$ represents extraneous images or "ghosts."


The following additional observations can be made:

a) From the Parseval theorem:

$$\iint |g_m(x, y)|^2\, dx\, dy = \iint |g_n(x, y)|^2\, dx\, dy \tag{3.5-4}$$

for all integers m and n.

b) Each ghost image decreases in intensity by the factor
1/(mN + 1) relative to the unquantized original.

c) In holography, as a result of the squaring operation
performed by the optical system, the ghost image intensity decrease
factor is $1/(mN + 1)^2$. In digital processing, this factor is
1/(mN + 1).

d) The largest ghost is $g_{-1}$, whose relative weight is
1/(1 - N) with respect to $g_0$.

e) One can also observe from Equation (3.5-2) that
$\lim_{N \to \infty} \hat{g} = g_0$, since

$$\mathrm{sinc}[m] = \begin{cases} 0 & m \neq 0 \\ 1 & m = 0 \end{cases} \tag{3.5-5}$$


In digital implementation, the continuous Fourier transform,
Equation (3.5-1), is replaced by its discrete equivalent, the Fourier
series. The implied periodicity of the latter results in the
reappearance of ghost images which have been cyclically shifted out of
the basic image region.
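The ghost weights in observations b) through d) follow from the expansion of the sinc function and can be tabulated numerically. The sketch below assumes the normalized definition sinc(x) = sin(pi x)/(pi x); the function names are illustrative.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def ghost_weight(m, n_levels):
    """Weight sinc(m + 1/N) of the m-th term in Equation (3.5-2)."""
    return sinc(m + 1.0 / n_levels)


def relative_weight(m, n_levels):
    """Closed form (-1)^m / (mN + 1), relative to the m = 0 term."""
    return (-1) ** m / (m * n_levels + 1)


for n in (2, 3, 4, 32):
    ratios = [round(ghost_weight(m, n) / ghost_weight(0, n), 4)
              for m in (-2, -1, 1, 2)]
    print(n, ratios)
```

For every N the m = -1 term has the largest magnitude, 1/(N - 1) relative to the original, and for N = 2 that magnitude equals one, which is the special case examined below for the two-level quantizer.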
It is possible to interpret the various ghost images, and the
display of the distorted image can be quite dramatic. A computer
experiment, similar to one which was holographically implemented
by Dallas, was performed.

Except for the 64 x 64 element upper right sub-block, the
"couple" image was zeroed out. This image was Fourier-
transformed and the phase was uniformly quantized at N = 2, 3, 4,
and 32 levels. The final images are reconstructed via the inverse
transform. The result of the experiment is shown in Figure 3.5-1.

The worst case, m = -1, requires special attention for the
two-level quantizer. Note that the weight factor for this case is
identical for m = -1 and m = 0. Furthermore, from Equation
(3.5-3)

$$g_{-1}(x, y) = \iint_{-\infty}^{\infty} |G(u, v)| \exp\bigl[-j\theta(u, v)\bigr] \exp\bigl[2\pi j(ux + vy)\bigr]\, du\, dv \tag{3.5-6}$$

or, equivalently

$$|g_{-1}(x, y)| = |g_0(-x, -y)| \tag{3.5-7}$$

By experimental construction, $g_0(-x, -y)$ does not overlap
with $g_0(x, y)$; the largest ghost image is the "mirror image" of the
original.
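The mirror-image ghost can be reproduced in a one-dimensional analogue of the experiment. The sketch makes several assumptions that are not taken from the text: a 32-sample test signal confined to one side of the origin, a direct (slow) discrete Fourier transform, and quantizer levels centered on multiples of $2\pi/N$, which is the convention implied by the weights in Equation (3.5-2).

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * u * t / n) for t in range(n))
            for u in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[u] * cmath.exp(2j * math.pi * u * t / n)
                for u in range(n)) / n for t in range(n)]


def quantize_phase(spectrum, n_levels):
    """Keep each magnitude; round the phase to the nearest multiple of 2 pi/N."""
    step = 2.0 * math.pi / n_levels
    return [abs(c) * cmath.exp(1j * step * round(cmath.phase(c) / step))
            for c in spectrum]


n = 32
signal = [0.0] * n
signal[1:9] = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0]   # object on one side

recon = idft(quantize_phase(dft(signal), 2))
for t in (1, 2, 3, 4):
    # reconstruction at t and at the mirrored position -t (cyclically)
    print(t, round(recon[t].real, 3), round(recon[(-t) % n].real, 3))
```

With two levels the reconstruction is symmetric about the origin: whatever survives of the object at position t reappears with equal strength at position -t, in line with Equation (3.5-7).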
Figure 3.5-1. Demonstration of Phase Quantization Effects for
Fourier Domain: (a) Original; (b) 2 Level Quantizer; (c) 3 Level
Quantizer; (d) 4 Level Quantizer; (e) 32 Level Quantizer. (This page
is reproduced at the back of the report by a different reproduction
method to provide better detail.)
Although the above described experiment is rather specialized,
it does emphasize the importance of the global nature of distortion
introduced by the phase quantizer nonlinearity. Availability of the
relevant MSE provides little if any information about the nature of
the distortion.

The digital experiment was repeated for the Walsh domain and
results are shown in Figure 3.5-2. The significant difference can be
explained by the symmetry of the decomposition rather than by the
functional properties of the eigenfunctions. Actually, the analysis
related to Figure 3.5-2 is much simpler than the one associated with
the Fourier case.

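Consider the decomposition of $f(x_i, y_j)$ in terms of the even and
odd Walsh functions:

$$f(x_i, y_j) = \sum_{k=0}^{N/2-1} \sum_{l=0}^{N/2-1} \Bigl\{ b_{cc}(k, l)\, \mathrm{cal}(k, x_i)\, \mathrm{cal}(l, y_j) + b_{sc}(k, l)\, \mathrm{sal}(k, x_i)\, \mathrm{cal}(l, y_j) + b_{cs}(k, l)\, \mathrm{cal}(k, x_i)\, \mathrm{sal}(l, y_j) + b_{ss}(k, l)\, \mathrm{sal}(k, x_i)\, \mathrm{sal}(l, y_j) \Bigr\} \tag{3.5-8}$$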


Figure 3.5-2. Demonstration of Phase Quantization Effects for
Walsh Domain: (a) 2 Level Quantizer; (b) 3 Level Quantizer; (c) 4
Level Quantizer; (d) 32 Level Quantizer
The conventional sequency-ordered Walsh transform will yield
the b matrix for the two-dimensional case. Consider the following
"unconventional" phase definition such that

$$f(x_i, y_j) = \sum_{k=0}^{N/2-1} \sum_{l=0}^{N/2-1} \Bigl\{ B_c(k, l) \cos\theta_1(k, l)\, \mathrm{cal}(k, x_i)\, \mathrm{cal}(l, y_j) + B_c(k, l) \sin\theta_1(k, l)\, \mathrm{sal}(k, x_i)\, \mathrm{cal}(l, y_j) + B_s(k, l) \cos\theta_2(k, l)\, \mathrm{cal}(k, x_i)\, \mathrm{sal}(l, y_j) + B_s(k, l) \sin\theta_2(k, l)\, \mathrm{sal}(k, x_i)\, \mathrm{sal}(l, y_j) \Bigr\} \tag{3.5-9}$$

where

$$B_c^2 = b_{cc}^2 + b_{sc}^2, \qquad B_s^2 = b_{cs}^2 + b_{ss}^2$$

$$\theta_1 = \tan^{-1}(b_{sc}/b_{cc}), \qquad \theta_2 = \tan^{-1}(b_{ss}/b_{cs})$$
For the particular original of Figure 3.5-1, the coefficients are
equal, $b_{sc}(k, l) = b_{ss}(k, l) = b_{cc}(k, l) = b_{cs}(k, l) = b(k, l)$.
This can be shown by letting $f(x_i, y_j) \neq 0$ in Equation (3.5-8).
Because of the symmetry of the image, it follows that
$f(-x_i, y_j) = f(x_i, -y_j) = f(-x_i, -y_j) = 0$. Simple algebraic
manipulation of Equation (3.5-8) will yield the equality of the
coefficients. Consequently, $\theta_1 = \theta_2 = \pi/4$.

If $f(x_i, y_j)$ is the image in Figure 3.5-1a, Equation (3.5-8)
becomes

$$f(x_i, y_j) = \sum_{k=0}^{N/2-1} \sum_{l=0}^{N/2-1} b(k, l) \Bigl\{ \mathrm{cal}(k, x_i)\, \mathrm{cal}(l, y_j) + \mathrm{sal}(k, x_i)\, \mathrm{cal}(l, y_j) + \mathrm{cal}(k, x_i)\, \mathrm{sal}(l, y_j) + \mathrm{sal}(k, x_i)\, \mathrm{sal}(l, y_j) \Bigr\} \tag{3.5-10}$$


Consider the application of a two-level uniform phase
quantizer. Equation (3.5-10) will become

$$\hat{f}(x_i, y_j) = \sum_{k=0}^{N/2-1} \sum_{l=0}^{N/2-1} \sqrt{2}\, b(k, l) \Bigl\{ \mathrm{sal}(k, x_i)\, \mathrm{cal}(l, y_j) + \mathrm{sal}(k, x_i)\, \mathrm{sal}(l, y_j) \Bigr\} \tag{3.5-11}$$

The result for the three-level quantizer is

$$\hat{f}(x_i, y_j) = \sum_{k=0}^{N/2-1} \sum_{l=0}^{N/2-1} \sqrt{2}\, b(k, l) \Bigl\{ \mathrm{cal}(k, x_i)\, \mathrm{cal}(l, y_j) + \mathrm{cal}(k, x_i)\, \mathrm{sal}(l, y_j) \Bigr\} \tag{3.5-12}$$

For the four-level quantizer, the quantized result is identical
to the original (unquantized). The symmetry of the quantized images
in Figure 3.5-2 is equivalent to the symmetry expressed by the
related Equations (3.5-11) and (3.5-12).

The Walsh domain phase quantization experiment provides
another indication regarding the nonlinear nature of the quantizer.

The phase definitions $\theta_1$ and $\theta_2$ may appear
artificial; however, they are convenient in the sense that they are
defined on adjacent transform pairs in the conventionally ordered
two-dimensional Walsh transform.
3.6 The $\alpha$ Processor

In the context of the superiority of phase over amplitude, the
performance of the so-called $\alpha$ processor is not unexpected.

Consider the image Fourier transform for the two-dimensional
case

$$\tilde{I}(u, v) = \iint_{-\infty}^{\infty} I(x, y)\, e^{-2\pi i (xu + yv)}\, dx\, dy = |\tilde{I}(u, v)|\, e^{-j\theta(u, v)} \tag{3.6-1}$$

The $\alpha$ processor is defined as the nonlinear operator,
$T_\alpha$, which raises the transform amplitude to the power
$\alpha$:

$$T_\alpha\{\tilde{I}(u, v)\} = |\tilde{I}(u, v)|^{\alpha}\, e^{-j\theta(u, v)} \tag{3.6-2}$$

Consider the effect for $\alpha \in [0, 1]$. One can explicitly
designate the transform amplitude, $|\tilde{I}(u, v)|$, by two terms,
where R(u, v) is the image power spectral density and r(u, v) the
amplitude fluctuation around the power spectral density:

$$|\tilde{I}(u, v)| = R(u, v) + r(u, v) \tag{3.6-3}$$

therefore,

$$H(u, v)\, \tilde{I}(u, v) = \{H(u, v) R(u, v) + H(u, v) r(u, v)\}\, e^{-j\theta(u, v)} \tag{3.6-4}$$
Here H(u, v) is a linear filter. Consider the application of
$T_\alpha$, $\alpha \in [0, 1]$:

$$T_\alpha \tilde{I}(u, v) = |\tilde{I}(u, v)|^{\alpha}\, e^{-j\theta(u, v)} = \bigl[R(u, v) + r(u, v)\bigr]^{\alpha} e^{-j\theta(u, v)} \approx \bigl[R^{\alpha} + \alpha R^{\alpha - 1} r\bigr] e^{-j\theta(u, v)} \tag{3.6-5}$$

The ratio of the amplitude fluctuation and the power spectral
density has decreased in Equation (3.6-5) from r(u, v)/R(u, v) to
$\alpha\, r(u, v)/R(u, v)$. One guarded observation is that amplitude
entropy has decreased by an amount related to $(1 - \alpha)$.

For $\alpha = 0$, image transform amplitudes are identically unity.
Consequently, the image in this case becomes a white process, since
the image power spectral density is also a constant. The $\alpha = 0$
case demonstrates two interesting image properties. First, under
conventional ergodic assumptions the image becomes uncorrelated. Yet,
visual inspection of the appropriate images (Figure 3.6-1) indicates
that basic image features have not changed. The $\alpha = 0$ filter
drastically changed image statistics, yet the apparent visual image
structure remained relatively unaltered.
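The whitening effect of the $\alpha = 0$ case can be verified on a short one-dimensional sequence. This is a sketch under stated assumptions: a direct discrete Fourier transform stands in for the two-dimensional image transform, the test sequence is arbitrary, and only the statistical claim (a delta-like autocorrelation) is checked, not the visual one.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * u * t / n) for t in range(n))
            for u in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[u] * cmath.exp(2j * math.pi * u * t / n)
                for u in range(n)) / n for t in range(n)]


def alpha_processor(x, alpha):
    """Raise each transform amplitude to the power alpha; keep the phase."""
    shaped = [abs(c) ** alpha * cmath.exp(1j * cmath.phase(c)) for c in dft(x)]
    return [v.real for v in idft(shaped)]


def circular_autocorrelation(x):
    n = len(x)
    return [sum(x[t] * x[(t + k) % n] for t in range(n)) for k in range(n)]


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0,
          5.0, 3.0, 5.0, 8.0, 9.0, 7.0, 9.0, 3.0]
white = alpha_processor(signal, 0.0)
print([round(v, 6) for v in circular_autocorrelation(white)])
```

With $\alpha = 1$ the processor returns the input unchanged, and with $\alpha = 0$ the autocorrelation vanishes at every nonzero lag, so intermediate values of $\alpha$ interpolate between the original statistics and a white process while the phase is left untouched.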

3.7  Phase-Only  Image  (Polynomial  Magnitude  Fit) 

The $\alpha$ processor has decreased the amplitude entropy in the
transform domain; however, it also changed the image power spectral
density. It is important to separate the two effects. An approximate
linear inverse filter to $T_\alpha$ could restore the power spectral
density to its original form. A more straightforward technique is to
fit a particular type of surface to the image transform amplitudes.
This second approach is considered in this section.

Figure 3.6-1. Demonstration of the $\alpha$ Processor

Consider the two-dimensional transform domain of an image,
$\tilde{I}(u, v)$. It is not necessary to specify the particular
transform. The discretized version of $\tilde{I}$ will be used, such
that transform parameters u and v are integers.

The image transform amplitudes will be least square fitted by
a two-dimensional surface, Z(u, v), of the following form:

$$Z(u, v) = R(u, v)\left[a_{00} + a_{10} u + a_{01} v + a_{20} u^2 + \cdots + a_{0N} v^N\right] \tag{3.7-1}$$


or in a more compact notation:

$$Z(u, v) = R(u, v) \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij}\, u^i v^j \tag{3.7-2}$$

The weight function R(u, v) is specified in advance and the
coefficients $a_{ij}$ are the unknowns to be determined. For a given
N, the number of coefficients is 1/2 (N + 1)(N + 2).

The mathematical objective is to minimize the mean-square
deviation between $|\tilde{I}(u, v)|$ and Z(u, v), that is

$$\sum_u \sum_v \left[ R(u, v) \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij}\, u^i v^j - |\tilde{I}(u, v)| \right]^2 = \text{minimum} \tag{3.7-3}$$

The minimization is accomplished by differentiating (3.7-3)
with respect to $a_{kl}$, for k = 0, 1, ..., N; l = 0, 1, ..., N - k,
and solving the 1/2 (N + 1)(N + 2) linear equations

$$\sum_u \sum_v R^2(u, v) \left\{ \sum_{i=0}^{N} \sum_{j=0}^{N-i} a_{ij}\, u^i v^j \right\} u^k v^l = \sum_u \sum_v R(u, v)\, |\tilde{I}(u, v)|\, u^k v^l \tag{3.7-4}$$


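Equation (3.7-4) can be rewritten in the following matrix notation

$$\sum_u \sum_v R^2(u, v) \begin{bmatrix} 1 \\ u \\ v \\ uv \\ \vdots \\ v^N \end{bmatrix} \begin{bmatrix} 1 & u & v & uv & \cdots & v^N \end{bmatrix} \begin{bmatrix} a_{00} \\ a_{10} \\ a_{01} \\ a_{11} \\ \vdots \\ a_{0N} \end{bmatrix} = \sum_u \sum_v \begin{bmatrix} R(u, v)\, |\tilde{I}(u, v)| \\ R(u, v)\, |\tilde{I}(u, v)|\, u \\ R(u, v)\, |\tilde{I}(u, v)|\, v \\ \vdots \\ R(u, v)\, |\tilde{I}(u, v)|\, v^N \end{bmatrix} \tag{3.7-5}$$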


Equation (3.7-5) is in the form of a conventional linear matrix
equation with the column matrix of the $a_{ij}$'s being the unknown.
For a given image transform and a specified weight function, Equation
(3.7-5) may be solved by many conventional techniques for the
solution of systems of linear equations (Blum, 1972).

The actual least square fit is dependent on the choice of the
weight function, R(u, v). Note also that $Z^2(u, v)$ is an estimate of
the image power spectral density. Any a priori image information
should be incorporated into R(u, v).
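The normal equations (3.7-4) can be assembled and solved for a small synthetic case. The following is a sketch only: the weight function, the grid of (u, v) values, and the helper names are hypothetical, and a plain Gaussian elimination stands in for whichever linear solver was actually used.

```python
def monomials(degree):
    """Exponent pairs (i, j) with i + j <= degree: (N + 1)(N + 2)/2 terms."""
    return [(i, j) for i in range(degree + 1) for j in range(degree + 1 - i)]


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_surface(amplitude, weight, degree):
    """Solve the normal equations (3.7-4); inputs are dicts keyed by (u, v)."""
    terms = monomials(degree)
    n = len(terms)
    lhs = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for (u, v), amp in amplitude.items():
        r = weight[(u, v)]
        basis = [u ** i * v ** j for i, j in terms]
        for p in range(n):
            rhs[p] += r * amp * basis[p]
            for q in range(n):
                lhs[p][q] += r * r * basis[p] * basis[q]
    return dict(zip(terms, solve_linear(lhs, rhs)))


# Hypothetical test surface: a known polynomial times an arbitrary weight.
weight = {(u, v): 1.0 / (1.0 + u + v) for u in range(6) for v in range(6)}
true = {(0, 0): 2.0, (1, 0): 0.5, (0, 1): -0.25, (1, 1): 0.1}
amplitude = {(u, v): weight[(u, v)] * sum(c * u ** i * v ** j
                                          for (i, j), c in true.items())
             for (u, v) in weight}
coeffs = fit_surface(amplitude, weight, 2)
print({k: round(c, 6) for k, c in coeffs.items()})
```

The fit recovers the generating coefficients when the data are exactly of the assumed form; for higher degrees the normal equations become poorly conditioned, which is one reason a library least-squares routine would normally be preferred over this hand-written elimination.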
Under the conventional separable Markov model, the image
correlation function is of the form $e^{-\alpha(|x| + |y|)}$ and the
appropriate power spectral density is of the form
$[(u^2 + \alpha^2)(v^2 + \alpha^2)]^{-1}$. A reasonable choice of
R(u, v) can therefore be picked as

$$R(u, v) = \left[(u^2 + \alpha^2)(v^2 + \alpha^2)\right]^{-1/2} \tag{3.7-6}$$

For the adjacent element correlation of 0.95 the value of $\alpha$ is
$-\ln 0.95$.

The utilization of Equation (3.7-6) for the least square fit
problem indicates how good (or bad) the Markov model is. If the
Markov model perfectly represented the image statistics, all
coefficients except the $a_{00}$ term would be zero. The ratios of the
appropriate coefficients (e.g., $a_{ij}/a_{00}$, i + j > 0) provide
quantitative information on the deviation between the actual power
spectral density and the one predicted by the Markov model.

The replacement of the individual amplitude values by the
appropriate related power spectral density values provides an
important demonstration of the phase superiority. The $L^2$ image
values for an L x L image are represented by $\tfrac12 L^2$ amplitude
and $\tfrac12 L^2$ phase values. The power spectral density surface is
prescribed by a negligible (relative to $\tfrac12 L^2$) number of
coefficients. The availability of the least square fitted surface
permits the replacement of $\tfrac12 L^2$ values, in effect, by a few
parameters.

Equation (3.7-5) was implemented numerically utilizing the
Fourier domain. The Markov model was used for the weight function
with 0.95 as the element-to-element vertical and horizontal
correlation. The highest degree of the two-dimensional polynomial was
2 through 5 for the five cases considered. The respective number of
terms in the polynomial ranged from 6 through 21. Table 3.7-1
shows the various cases.


Table 3.7-1

Degree and Number of Terms in the Surface Fitting Polynomials

N      (1/2)(N + 1)(N + 2)

1       3
2       6
3      10
4      15
5      21


The images generated by the above-outlined procedure have
good visual correlation with the original (Figure 3.7-1). The high
spatial frequency details are completely preserved. Not unexpectedly,
the basic apparent distortions are in the very low-frequency
region. Generally, it is the low-frequency region which does not
lend itself to good statistical characterization. The reason is that
the low-frequency amplitudes can be recovered from a very coarsely
sampled image; thus the law of large numbers, which is always
implied in an ergodic approximation, does not apply. On the other
hand, one may be too generous in allocating extra bits for the very
low-frequency region since the impact on the overall bit rate will be
negligible.

Figure 3.7-1. Demonstration of Amplitude Polynomial Fit: (a)
Original; (b) 21 Term Expansion; (c) 15 Term Expansion; (d) 10 Term
Expansion; (e) 6 Term Expansion

The pictorial representation of the actual polynomial surfaces
is shown in Figure 3.7-2 while the calculated coefficients are given
in Table 3.7-2. The large value for coefficients other than the
$a_{00}$ term indicates that the exponential Markov correlation model
requires higher order corrections.


The amplitude surface fitting procedure could be utilized in
the development of an actual transform coding algorithm; however,
it was abandoned in favor of the recursive approach which is the topic
of Chapters 4 through 6. The solution of Equation (3.7-5) and the
recalculation of the amplitude surface are likely to generate such
additional computation load, in addition to the actual transform
algorithm, that any practical implementation would be prohibitive.
For the fifth degree polynomial approximation a 21st order matrix
equation must be solved. Each surface element recalculation
requires in excess of 21 addition and multiplication operations. The
latter operations amount to a higher number of arithmetic steps than
required by the full size Fourier transform. In addition, both the
solution of the matrix equation and the reconstruction of the surface
are somewhat ill-conditioned. The numerical implementations of
this section were done on a 60-bit wordlength computer. It is
anticipated that the round-off errors might not be negligible had the
same calculations, particularly the matrix equation, been performed
on a computer with shorter word length, in which case the
requirement for double precision would further increase the
computational load of the coding-decoding procedures.

Figure 3.7-2. Fourier Domain Display of Polynomial Fitted
Amplitudes: (a) Original; (b) 21 Term Expansion; (c) 15 Term
Expansion; (d) 10 Term Expansion; (e) 6 Term Expansion


Table 3.7-2. Calculated polynomial coefficients, tabulated by number
of coefficients (6, 10, 15, ...). [Table body not legible in this
copy.]


4.  EXPERIMENTAL  RESULTS  I  (MONOCHROME) 

The concepts developed in the preceding chapters have been
implemented. Computer algorithms have been developed for the
coding and decoding of various images. This chapter considers the
algorithm for monochrome images.

For  practical  reasons,  the  coding  algorithms  only  included 
digital  input  and  output.  For  the  monochrome  image  coding  examples, 
the  input  is  a  square  image  sampled  over  a  256  x  256  grid.  Each 
sample  is  linearly  quantized  to  256  levels. 

The  significant  achievement  of  the  adaptive  phase  coding  process 
discussed  here  is  that  the  transmitter  is  slaved  to  the  receiver  with- 
out  any  overhead  information.  Yet  complete  adaptivity  is  possible, 
as  well  as  arbitrary  sample  reduction.  The  drawback  of  adaptive 
procedures  is  the  requirement  for  large  buffers.  This  requirement 
is  unavoidable  but  it  is  not  likely  to  be  important  in  the  environment 
of  computer-to-computer  communication.  In  this  case,  the  undecoded 
images  can  easily  be  stored,  for  example,  on  magnetic  tapes. 

The most demanding computational step is the large-size,
256 by 256, image transform. It is interesting to note that the
computational complexity, that is, the number of arithmetic operations,
increases rather slowly when the sub-block transforms are replaced
by one single large transform. For example, the ratio of the number
of operations for the entire image transform (256 x 256) vs
16 x 16 sub-blocks is log 256/log 16 = 2. A factor of two increase
in arithmetic complexity is not too extreme in computer implementation.

The 256 x 256 size is generally too large to permit the two-dimensional
transformation entirely in core. The cost of the additional
I/O operations should also be included.
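The factor of two quoted above can be checked with a short sketch. It assumes a separable radix-2 fast transform whose cost for a B x B block is proportional to B^2 log2 B; the function name `transform_ops` and the constant factor are illustrative assumptions, not taken from the dissertation.

```python
import math

def transform_ops(image_size: int, block_size: int) -> float:
    """Approximate operation count (up to a constant factor) for transforming
    an image_size x image_size image in block_size x block_size sub-blocks."""
    blocks = (image_size // block_size) ** 2
    # Assumed cost model: B row transforms plus B column transforms,
    # each costing about B * log2(B) operations.
    per_block = 2 * block_size ** 2 * math.log2(block_size)
    return blocks * per_block

full_image = transform_ops(256, 256)   # one single large transform
sub_blocks = transform_ops(256, 16)    # 16 x 16 sub-block transforms
ratio = full_image / sub_blocks
print(ratio)
```

Under this cost model the block count and block area cancel, so the ratio reduces to log 256 / log 16, as stated in the text.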

4.1 Description of the Algorithm

The  importance  of  image  representation  by  amplitude  and  phase 
was  demonstrated  in  Chapter  3.  In  particular,  the  phase  superiority 
was  established.  The  coding  algorithm  should  incorporate  these 
important  properties  of  the  transform  domain. 

The  following  assumptions  are  made:  (1)  the  transform  values 
are  uncorrelated  and  normally  distributed,  (2)  the  power  spectral 
density,  equivalent  to  the  sample  variances,  is  a  smooth  surface. 

It  is  significant  to  note  that  these  assumptions  are,  in  fact,  related. 

It  can  be  shown  that  the  Fourier  transform  will  produce  uncorrelated 
samples  under  the  assumption  of  smooth  power  spectral  density 
(Papoulis,  1965;  Chapter  13). 

The basic two assumptions lead to the equivalent amplitude and
phase representation. Furthermore, the amplitude is Rayleigh and
the phase is uniformly distributed. Specifically, an N x N image is
decomposed into $N^2/2$ amplitude and $N^2/2$ phase terms, which
are, by assumption, mutually independent. Two separate coding
schemes were developed depending on the transform symmetry. One
coder utilizes the complex Fourier transform. The conventional
odd and even function decomposition into amplitude and phase is
used by the second coding scheme for which the Walsh transform is
used. The two types of representation are related by a simple
mapping, thus either coding scheme is sufficient for both Fourier
and Walsh transformations. The schematics of the coding-decoding
process are shown in Figure 4.1-1.

Detailed  descriptions  of  the  algorithms  for  both  the  Fourier 
and  Walsh  transform  are  provided  in  this  section. 

The 256 x 256 image is Fourier-transformed. The conventional
representations of the Fourier domain are shown in Figures
4.1-2 and 4.1-3. The arrows indicate increasing harmonics in the
horizontal and vertical directions. The number pairs in parentheses
indicate the ordering of the amplitudes (or phases) according to
harmonics. The discrete fast numerical transform yields Figure
4.1-2. The more familiar diffraction pattern is shown in Figure
4.1-3. By interchanging the two halves of the pattern either
representation can be easily mapped into the other one.
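The half-interchange mentioned above can be sketched as follows. This is a minimal illustration that assumes only the left and right halves of each row are exchanged (the rows stay in natural order) and that the row length is even; the function name `swap_column_halves` is hypothetical.

```python
def swap_column_halves(plane):
    """Exchange the left and right halves of every row of a transform plane.

    Assumption: each row has an even number of samples, so the mapping is
    its own inverse and converts either representation into the other.
    """
    half = len(plane[0]) // 2
    return [row[half:] + row[:half] for row in plane]

# A tiny 2 x 4 stand-in for the 256 x 256 transform plane.
example = [[0, 1, 2, 3],
           [4, 5, 6, 7]]
swapped = swap_column_halves(example)
restored = swap_column_halves(swapped)
```

Applying the function twice returns the original plane, which is why a single routine serves for both directions of the mapping.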

An ordering must be established which specifies the sequence
for the Fourier domain quantization. The rows are indexed according
to the natural ordering. Referring to Figure 4.1-3, the first row is
the top of the pattern and then the coder proceeds downward. The
significant practical advantage of this scheme is that the computer
algorithm will not require a large memory block. It is not required
to store more than a small fraction of the discrete transform in
memory; this, however, depends on the complexity of the predicting
algorithm.


Figure 4.1-1. Schematic of the Coding-Decoding Procedure

Figure 4.1-2. Conventional Fourier Domain Representation I

Figure 4.1-3. Conventional Fourier Domain Representation II
(Note: Columns 128 and -128 are identical)

Within each row, the coder starts with the lowest horizontal
harmonic, then it proceeds to the right (refer again to Figure 4.1-3),
following which it repeats the process moving to the left from the
center.

The code words are generated by the quantization of the amplitude
and phase values. The phase values are uniformly quantized.
The amplitude is companded and then processed by a uniform
quantizer. The number of quantum levels is set in linear proportion
to the variance of the transform samples. The number of quantum
levels for the phase is twice as high as that for the amplitudes when
this number is four or larger. For the two-level amplitude quantization,
eight phase quantum levels are specified. The transform
domain variance is estimated from the previously quantized amplitude
values. Clearly, the estimate based on the amplitudes prior to
quantization would be preferable; however, it would lead to an
undecodable process. The decoder will also perform the estimation
process and it only has access to the previously quantized amplitudes.

Estimation of the variance of the next amplitude to be quantized
follows a rather simple rule. The density function for the Rayleigh
distribution is given by

$$p(x) = \begin{cases} \dfrac{x}{\sigma^2}\, e^{-x^2/2\sigma^2}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases} \tag{4.1-1}$$

The first two moments are

$$E(x) = \sigma \sqrt{\pi/2} \tag{4.1-2}$$

$$\mathrm{VAR}(x) = E\{[x - E(x)]^2\} = \left(2 - \frac{\pi}{2}\right)\sigma^2 \tag{4.1-3}$$

For each transform amplitude, the compander needs $\sigma$, and the
number of quantum levels is determined by the variance. Equations
(4.1-2) and (4.1-3) can be rewritten in a more useful form as

$$\sigma = \sqrt{2/\pi}\; E(x) \tag{4.1-4}$$

$$\mathrm{VAR}(x) = \left(\frac{4}{\pi} - 1\right) E^2(x) \tag{4.1-5}$$

Equations (4.1-4) and (4.1-5) indicate that the estimate of the average
amplitude also specifies the standard deviation and the variance. The
amplitude estimate is determined by averaging the previously quantized
amplitudes in a neighborhood surrounding the estimate. This
neighborhood is determined by the ordering of the transform domain.
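As a small worked illustration of how the average amplitude fixes both Rayleigh parameters, the sketch below applies the standard Rayleigh relations $\sigma = \sqrt{2/\pi}\,E(x)$ and $\mathrm{VAR}(x) = (4/\pi - 1)E^2(x)$. The function name `rayleigh_from_mean` is an illustrative assumption.

```python
import math

def rayleigh_from_mean(mean_amplitude: float):
    """Return (sigma, variance) of a Rayleigh variable with the given mean."""
    sigma = math.sqrt(2.0 / math.pi) * mean_amplitude           # Eq. (4.1-4)
    variance = (4.0 / math.pi - 1.0) * mean_amplitude ** 2      # Eq. (4.1-5)
    return sigma, variance

sigma, variance = rayleigh_from_mean(10.0)
# Consistency check against the direct form VAR(x) = (2 - pi/2) * sigma^2.
direct = (2.0 - math.pi / 2.0) * sigma ** 2
```

Both routes to the variance agree, so the coder only has to carry one number, the amplitude estimate, from sample to sample.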

The d.c. value is transmitted without requantization. The
estimate of this term is, therefore, perfect. The estimate of the
next value is also the d.c. value. This term will be quantized and
the reconstructed value is available for the estimate of the next
amplitude. The estimate of the third value is the arithmetic mean
of the d.c. term and the first quantized harmonic. For all other
values on the first row, the amplitude estimate is the average of the
three previously reconstructed terms.

The estimation of the amplitudes on the zero column (the
column containing the d.c. term) is the exact symmetrical equivalent
of the first row. All other estimates are generated from four
previously quantized values by simple averaging. These samples are
three values from the previous row and the just previously quantized
amplitude on the same row. Equations (4.1-6) through (4.1-15) are
mathematical forms of these sample estimates. The subscripts refer
to the horizontal and vertical harmonic ordering of Figure 4.1-3.


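Here a caret marks an estimate and a tilde marks a previously
quantized (reconstructed) amplitude.

$$\hat{x}_{0,0} = x_{0,0} \tag{4.1-6}$$

$$\hat{x}_{0,1} = x_{0,0} \tag{4.1-7}$$

$$\hat{x}_{0,2} = (x_{0,0} + \tilde{x}_{0,1})/2 \tag{4.1-8}$$

$$\hat{x}_{0,j} = (\tilde{x}_{0,j-1} + \tilde{x}_{0,j-2} + \tilde{x}_{0,j-3})/3, \quad j > 2 \tag{4.1-9}$$

$$\hat{x}_{0,j} = (\tilde{x}_{0,j+1} + \tilde{x}_{0,j+2} + \tilde{x}_{0,j+3})/3, \quad j < 0 \tag{4.1-10}$$

$$\hat{x}_{1,0} = x_{0,0} \tag{4.1-11}$$

$$\hat{x}_{2,0} = (\tilde{x}_{1,0} + x_{0,0})/2 \tag{4.1-12}$$

$$\hat{x}_{i,0} = (\tilde{x}_{i-1,0} + \tilde{x}_{i-2,0} + \tilde{x}_{i-3,0})/3, \quad i > 2 \tag{4.1-13}$$

$$\hat{x}_{i,j} = (\tilde{x}_{i,j-1} + \tilde{x}_{i-1,j-1} + \tilde{x}_{i-1,j} + \tilde{x}_{i-1,j+1})/4, \quad i > 0,\ j > 0 \tag{4.1-14}$$

$$\hat{x}_{i,j} = (\tilde{x}_{i,j+1} + \tilde{x}_{i-1,j-1} + \tilde{x}_{i-1,j} + \tilde{x}_{i-1,j+1})/4, \quad i > 0,\ j < 0 \tag{4.1-15}$$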


The estimation of the zero row [Equations (4.1-6) through
(4.1-10)] and the zero column [Equations (4.1-11) through (4.1-13)]
are separated from the general form of estimation [Equation (4.1-14)
for the right and Equation (4.1-15) for the left half of the Fourier
plane]. The zero row and column usually have a higher degree of
energy concentration than their immediate neighborhood, due to
windowing, and thus require special consideration.
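The estimator rules for the zero row, the zero column, and the general case can be sketched for the right half-plane (column index j >= 0) as below. The array `recon` of previously reconstructed amplitudes, the function name `estimate_amplitude`, and the handling of the last column (where the upper-right neighbor does not exist and the sample directly above is reused) are assumptions made for illustration; the text does not specify the edge treatment.

```python
def estimate_amplitude(recon, i, j):
    """Estimate the amplitude at row i, column j (both >= 0) from values in
    recon that were reconstructed earlier in the coding order."""
    if i == 0:                                   # zero row
        if j <= 1:
            return recon[0][0]                   # d.c. term and first harmonic
        if j == 2:
            return (recon[0][0] + recon[0][1]) / 2
        return (recon[0][j - 1] + recon[0][j - 2] + recon[0][j - 3]) / 3
    if j == 0:                                   # zero column, mirror of the zero row
        if i == 1:
            return recon[0][0]
        if i == 2:
            return (recon[1][0] + recon[0][0]) / 2
        return (recon[i - 1][0] + recon[i - 2][0] + recon[i - 3][0]) / 3
    # General case: one value on the same row, three on the previous row.
    above = recon[i - 1]
    upper_right = above[j + 1] if j + 1 < len(above) else above[j]  # assumed edge rule
    return (recon[i][j - 1] + above[j - 1] + above[j] + upper_right) / 4

recon = [[8.0, 4.0, 2.0, 1.0],
         [6.0, 3.0, 2.0, 1.0]]
row_estimate = estimate_amplitude(recon, 0, 3)     # mean of 2, 4, 8
inner_estimate = estimate_amplitude(recon, 1, 2)   # mean of 3, 4, 2, 1
```

Only the previous row and the current row are read, which matches the remark that a small fraction of the transform needs to be held in memory.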

The mapping utilized by the compander is the distribution function
associated with the appropriate probability density function. It
is given by the following expression for the Rayleigh distribution:

$$y = F(x) = 1 - e^{-x^2/2\sigma^2}, \qquad x \in [0, \infty] \tag{4.1-16}$$

Its inverse is

$$x = F^{-1}(y) = \sigma \sqrt{-2 \ln(1 - y)}, \qquad y \in [0, 1] \tag{4.1-17}$$

In terms of the previous equations, the coding-decoding process
may be explicitly specified (see Figures 4.1-4 through 4.1-6).
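A minimal sketch of the companding pair follows, using the Rayleigh distribution function $F(x) = 1 - e^{-x^2/2\sigma^2}$ and its inverse $F^{-1}(y) = \sigma\sqrt{-2\ln(1-y)}$. The function names `compand` and `expand` are illustrative assumptions.

```python
import math

def compand(x: float, sigma: float) -> float:
    """Map an amplitude x >= 0 onto [0, 1) with the Rayleigh distribution function."""
    return 1.0 - math.exp(-x * x / (2.0 * sigma * sigma))

def expand(y: float, sigma: float) -> float:
    """Inverse mapping; y must be strictly below 1 so the logarithm is defined."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - y))

sigma = 1.3
round_trip = [expand(compand(x, sigma), sigma) for x in (0.1, 1.0, 2.5)]
```

Because the companded value is uniformly distributed on the unit interval when the Rayleigh model holds, a uniform quantizer can follow the compander directly.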


Figure 4.1-4. Various Functions Associated with Companding
the Unit Variance Rayleigh Process, (a) Density
Function (Equation 4.1-1), (b) Companding
Transform (Equation 4.1-16), (c) Inverse
Mapping (Equation 4.1-17)


Adaptive Decoding Algorithm

Coding Steps

1) Transmit d.c. phase and amplitude "perfectly"

2) Estimate current amplitude from those previously quantized,
utilizing one of the set of Equations (4.1-6) through
(4.1-15)

3) Determine variance of Rayleigh distribution from
Equations (4.1-4) and (4.1-5) by letting $E(x) = \hat{x}_{i,j}$

4) Compand amplitude through Equation (4.1-16)

5) Specify the number of quantum levels, $2^N$, according to
the amplitude variance

6) Quantize companded amplitude and phase by uniform
quantizer and transmit the appropriate code word. (Its
length is 2N + 1 bits if more than 2-level amplitude quantizer
is used, otherwise it is 4.)

7) Utilizing Equation (4.1-17), determine the actual reconstructed
amplitude and save for further estimation

8) Unless the entire transform plane is quantized, proceed
to Step 2 for the next amplitude and phase value processing.

Decoding Steps

1) Receive exact d.c. phase and amplitude

2) Estimate current variance of Rayleigh distribution from
Equations (4.1-4) and (4.1-5) by letting $E(x) = \hat{x}_{i,j}$;
$\hat{x}_{i,j}$ is determined via the estimator Equations (4.1-6)
through (4.1-15)

3) Determine code word length from the amplitude variance

4) Reconstruct inputs to the uniform quantizer (this is the
companded amplitude and phase)

5) Reconstruct amplitude utilizing Equation (4.1-17)

6) Unless the entire transform plane is decoded, proceed to
Step 2 for the next amplitude and phase decoding.
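The coding and decoding steps can be illustrated with a simplified one-dimensional sketch in which each estimate is the mean of the last three reconstructed amplitudes. This is not the dissertation's program: the constants `ALPHA` and `MAX_BITS`, the rule that a sample receiving zero bits is reconstructed as zero, the mid-interval reconstruction, and all function names are assumptions chosen to keep the example short.

```python
import math

ALPHA = 1.0      # hypothetical proportionality constant for the bit assignment
MAX_BITS = 6     # hypothetical cap on amplitude bits

def plan(history):
    """Return (sigma, amplitude_bits, phase_bits) from reconstructed amplitudes."""
    recent = history[-3:]
    mean = sum(recent) / len(recent)
    if mean <= 0.0:
        return 0.0, 0, 0
    sigma = math.sqrt(2.0 / math.pi) * mean
    var = (4.0 / math.pi - 1.0) * mean * mean
    n = max(0, min(MAX_BITS, math.floor(ALPHA * math.log2(var))))
    if n == 0:
        return sigma, 0, 0
    # Phase has twice as many levels (n + 1 bits); 8 levels for 2-level amplitude.
    return sigma, n, (3 if n == 1 else n + 1)

def dequantize(sigma, n, p, ia, ip):
    y = (ia + 0.5) / (1 << n)                        # centre of the quantizer cell
    amp = sigma * math.sqrt(-2.0 * math.log(1.0 - y))
    phase = -math.pi + (ip + 0.5) * 2.0 * math.pi / (1 << p)
    return amp, phase

def encode(amps, phases):
    history, recon, bits = [amps[0]], [(amps[0], phases[0])], []
    for x, ph in zip(amps[1:], phases[1:]):
        sigma, n, p = plan(history)
        if n == 0:                                   # sample discarded, no bits sent
            history.append(0.0)
            recon.append((0.0, 0.0))
            continue
        y = 1.0 - math.exp(-x * x / (2.0 * sigma * sigma))
        ia = min((1 << n) - 1, int(y * (1 << n)))
        ip = int((ph + math.pi) / (2.0 * math.pi) * (1 << p)) % (1 << p)
        bits.append(format(ia, "0%db" % n) + format(ip, "0%db" % p))
        a, q = dequantize(sigma, n, p, ia, ip)
        history.append(a)
        recon.append((a, q))
    return "".join(bits), recon

def decode(dc_amp, dc_phase, bitstream, count):
    history, out, pos = [dc_amp], [(dc_amp, dc_phase)], 0
    for _ in range(count - 1):
        sigma, n, p = plan(history)                  # same estimate as the encoder
        if n == 0:
            history.append(0.0)
            out.append((0.0, 0.0))
            continue
        ia = int(bitstream[pos:pos + n], 2)
        ip = int(bitstream[pos + n:pos + n + p], 2)
        pos += n + p
        a, q = dequantize(sigma, n, p, ia, ip)
        history.append(a)
        out.append((a, q))
    return out

amps = [200.0, 150.0, 90.0, 60.0, 20.0, 5.0, 1.0]
phases = [0.0, 1.0, -2.0, 3.0, -0.5, 2.5, -3.0]
bits, encoder_recon = encode(amps, phases)
decoded = decode(amps[0], phases[0], bits, len(amps))
```

The decoder receives only the exact d.c. term and the bit string, yet it recovers every code word length by repeating the encoder's estimate, which is the property the step lists rely on.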

Several important observations should be made at this point.
The coding-decoding process is clearly decodable. The decoding
must be done in the same order in which the encoder operated. In
other words, selective decoding of individual code words or sequences
of code words is not possible. The code words are clearly of the
variable-length type. The set of binary digits which represents the
entire coding process does not possess any particular algebraic
properties. It should be pointed out that although the coding process
is decodable the actual binary sequence is not decodable according to
the conventional definition of algebraic decodability. The quantization
of the Rayleigh process can effectively be demonstrated via
input-output diagrams as shown in Figures 4.1-7 through 4.1-10.

Since the decoding process is recursive, the errors made in
the decoding process can be catastrophic. A catastrophic error will
propagate throughout the decoding process; thus, all decoded values
will be in error past the one in the sequence where the first error
occurred. The primary source of error is channel noise which will
be considered in Section 4.4. A catastrophic error will occur when
the estimated variance is sufficiently incorrect to yield an incorrect
bit assignment. The result is loss of synchronization.

Figure 4.1-8. Four-Level Quantizer (input-output diagram; axes: INPUT, OUTPUT)

The bit assignment is based on Equation (4.1-5) and it is simply
of the form

$$N = [\alpha \log_2(\mathrm{VAR}(x))] \tag{4.1-18}$$

As previously indicated, the amplitude and phase, in general, are
specified by 2N + 1 or 2N + 2 bits. Here $\alpha$ is the proportionality
constant; the brackets [ ] specify the largest integer whose value does
not exceed the value within the brackets. Both the encoding and
decoding algorithms include a large number of arithmetic operations;
specifically, Equations (4.1-4) through (4.1-17) are utilized before
Equation (4.1-18) can be applied. In order to assure that the result
of Equation (4.1-18) is identical for both the encoding and decoding
processes, it is important that the sequence and accuracy of the
arithmetic operations be the same. The correct sequence is achieved
by proper programming. The accuracy consideration is much more
involved. Clearly, if the coding and decoding algorithms are implemented
on computers of different word-length, the deviation in round-off
error could lead to different bit assignments through Equation
(4.1-18). Even for the same computing equipment, a minor variation
in certain arithmetic steps, for example, different logarithm
evaluation for the coding and decoding operations, could result in
ambiguity. The ambiguity consideration is important; however, the
related difficulties can, again, be eliminated by careful programming.
In a universal version of the coding-decoding algorithms, table
look-up should be used instead of functional evaluation, and integer
arithmetic should replace all floating point operations. The same
programming considerations also result in improved efficiency for
most general purpose computers.
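One way to make the bit assignment $N = [\alpha \log_2(\mathrm{VAR}(x))]$ independent of floating point round-off is sketched below. It assumes the variance has been scaled to a positive integer and that the proportionality constant is a rational number `alpha_num / alpha_den`; both assumptions, the function names, and the rule that N <= 0 means no bits are sent are illustrative and not taken from the text.

```python
def amplitude_bits(variance_int: int, alpha_num: int = 1, alpha_den: int = 2) -> int:
    """Largest integer N with N <= (alpha_num / alpha_den) * log2(variance_int),
    computed with integer arithmetic only."""
    if variance_int < 1:
        return 0
    power = variance_int ** alpha_num
    # floor(log2(power)) equals bit_length - 1 for positive integers, and
    # flooring again after the integer division by alpha_den gives the same
    # result as flooring the exact quotient.
    return (power.bit_length() - 1) // alpha_den

def codeword_length(n: int) -> int:
    """Total bits for one amplitude and phase pair."""
    if n <= 0:
        return 0                    # assumed: sample discarded
    return 4 if n == 1 else 2 * n + 1

bits_at_1024 = amplitude_bits(1024)   # 0.5 * log2(1024) = 5
bits_at_1023 = amplitude_bits(1023)   # just below the boundary
```

Because only integer operations are involved, an encoder and a decoder running on machines of different word length would reach the same N at a boundary such as 1023 versus 1024.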

The utilization of Equation (4.1-18) for sample reduction actually
incorporates the novel features of both the zonal and threshold
approach to transform coding. Whenever the image power spectral
density significantly decreases for higher harmonics, Equation
(4.1-18) should lead to significant sample reduction. The coding
reduction procedure thus far outlined is highly image-dependent
(unlike zonal coding) and requires no additional bookkeeping information
(unlike threshold coding). The decoder is completely uninformed
of the degree of sample reduction; this information it can only
ascertain upon completion of the decoding process.

The natural form of the image power spectral density may not
lead to a sufficient degree of sample reduction. Appropriate application
of the filtering process in Figure 4.1-1, discussion of which
was delayed to the present, can significantly alter the bit rate.
Generally, the filter function is of the low-pass form (it can also be
image-dependent). The coder-decoder will operate on the modified
power spectrum. Thus any degree of sample reduction can be
achieved by selecting the appropriate filter. It should be observed
that, again, the application of the filter requires no bookkeeping bits
to the receiver. In fact, the decoding algorithm has no information
as to the type or structure of the filter if, in fact, one was utilized.

The small amount of white noise in the image significantly
alters the power spectral density for the higher harmonics. In fact,
for the higher harmonics, the sample variances are basically specified
by the noise spectral density. The coder cannot differentiate
between the true image and noise. Application of the filter may lead
to its more conventional role, that is, to increase the S/N of the
image. (The effects of noise on coding are further discussed in
subsection 4.2.)

The adaptive philosophy can easily be extended to other
orthogonal decompositions. The Walsh transform was utilized for
this implementation. The conventional schematic representation of
the sequency-ordered Walsh transform is shown in Figure 4.1-11.

The Walsh transform of the 256 x 256 matrix is another 256 x 256
matrix. The "unconventional" phase concept permits the description
of the transform plane by an equal number of phase and amplitude
terms. The following definition was used. In Figure 4.1-11 each
row is considered as 128 number pairs. These pairs are used for
the amplitude and phase definition in a similar manner to the
Fourier case. Let $(a_{i,2j-1}, a_{i,2j})$ be one such pair, then the
corresponding amplitude and phase values are defined respectively
as

$$x_{i,j} = \left(a_{i,2j-1}^2 + a_{i,2j}^2\right)^{1/2}, \qquad \tan\theta_{i,j} = \frac{a_{i,2j}}{a_{i,2j-1}}$$

The corresponding representation of the Walsh amplitudes is shown in
Figure 4.1-12.

Once the amplitude plane is specified it is obvious that the
various equations used for coding the Fourier plane, Equations
(4.1-4) through (4.1-17), are equally appropriate. There are only
two basic differences: (a) the estimator equations for the negative
(left) side are not needed and (b) the sequence of operation must
correspond to the symmetry of Figure 4.1-12. The coder will
again proceed downward row by row. Within each row it will always
proceed from the zero column to the right.

Once the coding algorithm is adjusted for the two minor differences
listed above, the coding and decoding steps listed for the
Fourier domain are equally valid for the Walsh domain. Similarly,
the various comments relating to bit assignment, ambiguity, and
implementation of computer-to-computer communications are equally
valid for the Walsh coder.
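The pairing of Walsh coefficients into amplitude and phase terms can be sketched as follows. The sketch assumes that the first coefficient of each pair plays the role of the real part and the second the imaginary part, and it uses `atan2` to resolve the quadrant, a convention the text does not state; the function name `walsh_amplitude_phase` is illustrative.

```python
import math

def walsh_amplitude_phase(row):
    """Convert one row of Walsh coefficients (even length) into a list of
    (amplitude, phase) pairs, one per consecutive coefficient pair."""
    pairs = []
    for k in range(0, len(row) - 1, 2):
        first, second = row[k], row[k + 1]
        amplitude = math.hypot(first, second)   # (first^2 + second^2)^(1/2)
        phase = math.atan2(second, first)       # tan(phase) = second / first
        pairs.append((amplitude, phase))
    return pairs

pairs = walsh_amplitude_phase([3.0, 4.0, 1.0, 0.0])
```

A row of 256 coefficients yields 128 such pairs, so the transform plane is again described by equal numbers of amplitude and phase terms.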

4.2  Effects  of  Noise  in  the  Original  Image 

Noise effects are considered in this section via a simplified
analytic model.

For the purpose of analysis the image correlation is modeled
by the simple exponential Markov expression

$$R(x,y) = e^{-\alpha|x| - \beta|y|} = \rho_1^{|x|}\,\rho_2^{|y|} \tag{4.2-1}$$

Further simplification is obtained by the assumption of identical
horizontal and vertical statistics. Therefore, let $\beta = \alpha$ and
$\rho_1 = \rho_2 = \rho$; then

$$R(x,y) = e^{-\alpha(|x| + |y|)} = \rho^{|x| + |y|} \tag{4.2-2}$$

Note also that $\alpha > 0$, $\rho < 1$, $\alpha = -\ln\rho$.

The application of the Markov model [Equation (4.2-2)] leads
to interesting quantitative results. In the following, the image is
assumed to be normalized such that its mean is zero and its variance
is unity. It is assumed that the image is corrupted by additive white
noise of power spectral density N. The average S/N in the image is,
therefore, 1/N. The local S/N in the transform domain, denoted by
Q, is the ratio of the image and noise power spectral densities:

$$Q(u,v) = S(u,v)/N \tag{4.2-3}$$

Restricting the discussion to the Fourier transform, the power spectral
density is given by

$$S(u,v) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} R(x,y)\, e^{-j2\pi(ux+vy)}\, dx\, dy \tag{4.2-4}$$

Utilizing Equation (4.2-2) for R,

$$Q(u,v) = \frac{4\alpha^2}{N\,[\alpha^2 + (2\pi u)^2]\,[\alpha^2 + (2\pi v)^2]} \tag{4.2-5}$$

Before proceeding with Equation (4.2-5) the problem must be
discretized. As before, the image is assumed to be properly sampled,
i.e., no aliasing, on a rectangular grid at locations

$$(x_n, y_m) = (n\Delta x,\ m\Delta y); \qquad n, m = 0, \pm 1, \pm 2, \ldots \tag{4.2-6}$$

The appropriate frequency band limits in the transform domain
are $[-1/2\Delta x, +1/2\Delta x]$ and $[-1/2\Delta y, +1/2\Delta y]$ for the horizontal and
vertical directions, respectively. For computational convenience,
let $\Delta x = \Delta y = 1$. Thus, both the horizontal and vertical extent of the
frequency domain is -1/2 to +1/2.

The behavior of Q is considered along the diagonal in the frequency
plane, i.e., u = v = f. Whenever the noise dominates,
Q(u,v) < 1. Letting Q(f,f) = 1, one can solve for the transition
region. Considerable simplification is achieved by the assumption
that at Q(f,f) = 1, $2\pi f \gg \alpha$. The latter inequality is realistic for
most images and it will be demonstrated for the specific example
utilized in this section. Equation (4.2-5) can be rewritten according
to the previous assumptions as

$$N = 4\alpha^2/(2\pi f)^4 \tag{4.2-7}$$

therefore

$$f = (\alpha^2/4\pi^4 N)^{1/4} \tag{4.2-8}$$

Since $4\pi^4 \approx 400$,

$$f = (2.5 \times 10^{-3}\, \alpha^2/N)^{1/4} \tag{4.2-9}$$

For the numerical utilization of Equation (4.2-9), $\alpha$ and N
must be specified. Let N = 0.001 and $\alpha$ = 0.05 corresponding to
$\rho$ = 0.95 in Equation (4.2-2). The value used for N is very conservative
since it corresponds to the image S/N of 1000. The
$\rho$ = 0.95 is a typical value. The test images in this dissertation
have an average sample-to-sample correlation approximately
corresponding to this value. The evaluation of Equation (4.2-9)
for specified values leads to the following:

$$f = [2.5 \times 10^{-3} \times 10^{3} \times (0.05)^2]^{1/4} \approx 0.1\,(60)^{1/4} = 0.27 \approx \tfrac{1}{2}(0.5)$$

According to this numerical demonstration, at 1/2 of the
highest vertical or horizontal harmonic the image power spectral
density drops below the noise level.
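The diagonal crossover frequency can be checked numerically with the sketch below. It assumes the separable Markov spectrum $Q(u,v) = 4\alpha^2 / \{N[\alpha^2 + (2\pi u)^2][\alpha^2 + (2\pi v)^2]\}$ and the approximation $f = (\alpha^2/4\pi^4 N)^{1/4}$; the function names are illustrative.

```python
import math

def local_snr(u: float, v: float, alpha: float, noise_psd: float) -> float:
    """Transform-domain S/N for the separable exponential Markov model."""
    a2 = alpha * alpha
    horizontal = a2 + (2.0 * math.pi * u) ** 2
    vertical = a2 + (2.0 * math.pi * v) ** 2
    return 4.0 * a2 / (noise_psd * horizontal * vertical)

def diagonal_crossover(alpha: float, noise_psd: float) -> float:
    """Approximate frequency along u = v at which the S/N falls to unity."""
    return (alpha ** 2 / (4.0 * math.pi ** 4 * noise_psd)) ** 0.25

alpha, noise_psd = 0.05, 0.001
f_cross = diagonal_crossover(alpha, noise_psd)
snr_at_crossover = local_snr(f_cross, f_cross, alpha, noise_psd)
print(round(f_cross, 3), round(snr_at_crossover, 4))
```

Evaluating the unapproximated ratio at the approximate crossover gives a value very close to one, which supports the simplification $2\pi f \gg \alpha$ for these parameter values.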

Further observations are also in order. Note should be taken
that $\alpha = 0.05 \ll 0.27 \times 2\pi$, thus the simplification that led to the
derivation of Equation (4.2-7) was, in fact, permitted. One can
also note that whenever the same simplification is allowed, the lines
of constant S/N in the transform plane are hyperbolas. From
Equation (4.2-5), letting $\alpha \ll 2\pi u$, $\alpha \ll 2\pi v$,

$$uv = \sqrt{\alpha^2/(4\pi^4 N Q)} \tag{4.2-10}$$

Equation (4.2-10) is the equation of a hyperbola whenever the right-hand
side is a constant.

The maximum value of Q is at the u = v = 0 location in the
frequency plane. It is

$$Q(0,0) = 4/\alpha^2 N = 1.6 \times 10^{6} \tag{4.2-11}$$

for the previously-specified values of $\alpha$ and N. The demonstrated
example indicates that the presence of even a small amount of
white noise will have a very significant effect. In this example, the
majority of transform domain samples are below the noise level
despite the fact that the noise level is approximately six orders of
magnitude below the peak of the power spectral density.

The previous analysis can be easily extended to the case where
the image correlation model is isotropic. For this case

$$R(x,y) = R\left(\sqrt{x^2 + y^2}\right) = e^{-\alpha\sqrt{x^2 + y^2}} \tag{4.2-12}$$

Letting $(x^2 + y^2)^{1/2} = r$ and $(u^2 + v^2)^{1/2} = f$, Equation (4.2-4) is
replaced by the Hankel transform:

$$S(f) = \int_0^{\infty} 2\pi r\, R(r)\, J_0(2\pi f r)\, dr \tag{4.2-13}$$

and thus

$$Q(f) = \frac{1}{N}\,(2\pi\alpha)\,[\alpha^2 + (2\pi f)^2]^{-3/2} \tag{4.2-14}$$

As in the derivation of Equation (4.2-7), the inequality $2\pi f \gg \alpha$ can be
used; therefore, whenever Q(f) = 1,

$$N = \frac{\alpha f^{-3}}{(2\pi)^2} \tag{4.2-15}$$

Furthermore,

$$f = \left[\frac{\alpha}{(2\pi)^2 N}\right]^{1/3} \tag{4.2-16}$$

The utilization of the previously-specified parameters ($\alpha$ = 0.05 and
N = 0.001) indicates that f > 1/2, therefore the transform domain
S/N in this instance will remain above unity in its region of definition.
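The isotropic case can be evaluated in the same way. The sketch assumes the isotropic spectrum $Q(f) = 2\pi\alpha\,[\alpha^2 + (2\pi f)^2]^{-3/2}/N$ and the approximate unity-S/N frequency $f = [\alpha/((2\pi)^2 N)]^{1/3}$; the function names are illustrative.

```python
import math

def isotropic_snr(f: float, alpha: float, noise_psd: float) -> float:
    """Transform-domain S/N for the isotropic exponential correlation model."""
    return 2.0 * math.pi * alpha / (noise_psd * (alpha ** 2 + (2.0 * math.pi * f) ** 2) ** 1.5)

def isotropic_crossover(alpha: float, noise_psd: float) -> float:
    """Approximate radial frequency at which the S/N falls to unity."""
    return (alpha / ((2.0 * math.pi) ** 2 * noise_psd)) ** (1.0 / 3.0)

alpha, noise_psd = 0.05, 0.001
f_unity = isotropic_crossover(alpha, noise_psd)
snr_at_band_edge = isotropic_snr(0.5, alpha, noise_psd)
print(round(f_unity, 3), round(snr_at_band_edge, 2))
```

With these parameters the unity crossing lies beyond the band edge at 1/2, so the S/N evaluated at the band edge is still above one, in line with the statement in the text.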

Although  both  of  the  previous  models  can  be  expected  to  deviate 
from  the  actual  image  power  spectral  density,  the  qualitative  results 
are  useful  in  that  they  demonstrate  the  importance  of  image  noise. 

4.3 Pictorial Examples

The coding procedures previously outlined in this chapter were
programmed for computer implementation. The results, using the
monochrome test images of Appendix A, are shown in this subsection.

The logarithmic amplitude and linear phase displays are shown
in Figure 4.3-1 for the Fourier and Walsh transforms. The same
figure includes a demonstration of entropy associated with these
phase images. To make the visual comparison between the two
transforms more meaningful, the conventional Walsh presentation
(Figure 4.1-11) is remapped and is shown according to the schematics
of the conventional Fourier display. All pictorial Walsh domain
presentations in this dissertation are done in this manner. The
entropy images are obtained in two steps. First, the phase range
$[-\pi, \pi]$ is uniformly quantized by a 64-level (6 bit) quantizer. Next,
the two-dimensional probability density function (e.g., histogram)
corresponding to the simultaneous occurrence of phase values
corresponding to adjacent row neighbors is calculated. The actual
entropy map is obtained by taking the base 2 logarithm of the two-dimensional
histogram. The actual entropy value corresponding to
this map is obtained by summing all 4096 elements and it is 11.99
bits (the maximum possible is 12 bits) for both transforms. The
obvious conclusion is that the various phase values are, in fact,
uncorrelated. The higher intensity level along the phase image
diagonals does indicate a small amount of residual correlation.
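The entropy measurement can be sketched with the usual definition $H = -\sum p \log_2 p$ applied to the joint histogram of neighboring quantized phases. The sketch assumes that "adjacent row neighbors" means horizontally adjacent samples within a row, and it uses synthetic independent uniform phases rather than an image transform; the function names are illustrative.

```python
import math
import random

def quantize_phase(phase: float, levels: int = 64) -> int:
    """Uniformly quantize a phase in [-pi, pi] to an index 0 .. levels-1."""
    index = int((phase + math.pi) / (2.0 * math.pi) * levels)
    return min(index, levels - 1)

def joint_entropy_bits(rows, levels: int = 64) -> float:
    """Entropy in bits of the joint histogram of horizontally adjacent phases."""
    counts = {}
    total = 0
    for row in rows:
        q = [quantize_phase(p, levels) for p in row]
        for left, right in zip(q, q[1:]):
            counts[(left, right)] = counts.get((left, right), 0) + 1
            total += 1
    return -sum(c / total * math.log2(c / total) for c in counts.values())

rng = random.Random(0)
synthetic = [[rng.uniform(-math.pi, math.pi) for _ in range(256)] for _ in range(256)]
entropy = joint_entropy_bits(synthetic)
print(round(entropy, 2))
```

For independent uniform phases the estimate approaches the 12-bit maximum and falls slightly short of it only because a finite number of pairs fills the 4096 histogram cells.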

The processed images are shown in Figures 4.3-2 and 4.3-3
corresponding to the Fourier and Walsh transforms, respectively.
In the coding-decoding process, the image is always converted into
a sequence of binary digits corresponding to the actual data rate.
The decoder uses the same sequence as its input.

Figure 4.3-1. Transform Domain Display of GIRL Image:
(a) Amplitude (Fourier), (b) Amplitude (Walsh), (c) Phase (Fourier),
(d) Phase (Walsh), (e) Entropy (Fourier Phase), (f) Entropy (Walsh Phase)

Figure 4.3-2. Coding-Decoding Examples (Fourier Transform).
The Mean Squared Error (M.S.E.) is Normalized
Relative to Energy in Original Image.
(a) M.S.E. 2.4%; bit rate 0.39
(b) M.S.E. 1.6%; bit rate 0.68
(c) M.S.E. 1.26%; bit rate 0.31
(d) M.S.E. 0.78%; bit rate 0.66

Figure 4.3-3. Coding-Decoding Examples (Walsh Transform).
The Mean Squared Error (M.S.E.) is Normalized
Relative to Energy in Original Image.
(a) M.S.E. 3.6%; bit rate 0.51
(b) M.S.E. 2.6%; bit rate 0.75
(c) M.S.E. 1.48%; bit rate 0.50
(d) M.S.E. 1.07%; bit rate 0.73

In Figure 4.3-4, the decoded transform planes are shown. For
this case, the sample reduction is obtained by rotationally symmetric
low-pass filtering. It should again be stated that the decoder is
uninformed about the type, or even the existence, of this low-pass
filter.

Typical examples of the "dynamically" determined bit planes
are shown in Figures 4.3-5 and 4.3-6.

Typical  performance  curves  are  shown  in  Figure  4.3-7.  The 

various  curves  were  generated  according  to  the  following  procedure. 

~  <0 

Let  T(p,  0 )  and  7(p,  0)  represent  the  original  and  decoded  image 
transforms  in  polar  coordinates;  also  the  normalization  relative  to 
integrated  variance  is  assumed: 

2tt  00 

f  f  T(p,  0)  p  dp  de  =  1  (4.  3-1) 

"o  e 

The  lower  limit  e  indicates  that  the  d.c.  term  is  excluded  in  the 
integration.  The  letter  designations  a  through  e  correspond  to  the 
following  five  functions  designated  as  through  Z^,  respectively: 

2tt  o 

Za(P)  l  l  lT<S’61!2 


sds  d@ 


(4.3-2) 


$$Z_b(\rho) = 1 - Z_a(\rho) \tag{4.3-3}$$

$$Z_c(\rho) = \int_0^{2\pi} \int_\epsilon^{\rho} |T(s, \theta) - \tilde{T}(s, \theta)|^2 \, s \, ds \, d\theta \tag{4.3-4}$$

$$Z_d(\rho) = \frac{1}{2\pi\rho} \frac{d}{d\rho} Z_a(\rho) \tag{4.3-5}$$

$$Z_e(\rho) = \frac{1}{2\pi\rho} \frac{d}{d\rho} Z_c(\rho) \tag{4.3-6}$$

Figure 4.3-4. Examples of Decoded Transform Planes for GIRL Image.
Panels include (c) Walsh Amplitude and (d) Walsh Phase.

Figure 4.3-7. Typical Performance Curves for Coding-Decoding
Examples (see text for various letter designations).
(a) Fourier Domain; (b) Walsh Domain. Horizontal axis: harmonics.


The curves in Figure 4.3-7 were generated from the discretized
versions of Equations (4.3-2) through (4.3-6). These functions
convey considerable information about the coding process (although in a
forced rotational symmetry). $Z_a$ is the integrated transform
variance. $Z_b$ corresponds to the truncation error. The integrated
overall coding error is $Z_c$. The image power spectral density is
$Z_d$. The local (in the transform plane) coding error is given by $Z_e$.
The curves $Z_d$ and $Z_e$ merge at the location of the low-pass filter
boundary.
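The integrated curves can be illustrated numerically. The following is a minimal sketch, not the program used for Figure 4.3-7: it assumes a discrete Fourier transform with the d.c. term at index (0, 0), a ring-by-ring sum in place of the polar integrals, and an ideal low-pass filter as a stand-in for the full coder. The names `radial_curves`, `n_bins`, and `rad` are hypothetical.

```python
import numpy as np

def radial_curves(T, T_dec, n_bins=32):
    """Return radii and discretized Z_a, Z_b, Z_c for one transform pair."""
    rows, cols = T.shape
    # Integer frequency indices, d.c. at (0, 0) as produced by np.fft.fft2.
    u = np.fft.fftfreq(rows) * rows
    v = np.fft.fftfreq(cols) * cols
    radius = np.hypot(u[:, None], v[None, :])
    keep = radius > 0                       # exclude the d.c. term
    power = np.abs(T) ** 2
    error = np.abs(T - T_dec) ** 2
    total = power[keep].sum()               # normalization of Eq. (4.3-1)
    edges = np.linspace(0.0, radius.max(), n_bins + 1)[1:]
    z_a = np.array([power[keep & (radius <= r)].sum() for r in edges]) / total
    z_c = np.array([error[keep & (radius <= r)].sum() for r in edges]) / total
    return edges, z_a, 1.0 - z_a, z_c

rng = np.random.default_rng(0)
image = rng.random((64, 64))                # synthetic stand-in image
T = np.fft.fft2(image)
f = np.fft.fftfreq(64) * 64
rad = np.hypot(f[:, None], f[None, :])
# Stand-in "decoder": samples outside a radius of 16 harmonics are discarded.
T_dec = np.where(rad <= 16, T, 0)
edges, z_a, z_b, z_c = radial_curves(T, T_dec)
```

In this stand-in the coding error is zero inside the filter boundary, so the final value of `z_c` equals the truncation error at the boundary; a real coder would add quantization error inside the pass band as well.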

In Figure 4.3-8, the sample reduction is obtained by
discarding transform samples whenever the amplitude is below a
certain value. The subsequent coding is the same as in previous
examples.
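The threshold form of sample reduction can be sketched as follows. This is an illustration only: the threshold level (here the 75th percentile of the amplitudes) and the names `threshold_samples`, `level`, and `mse` are assumptions, and no quantization is applied to the retained samples. The error is normalized relative to the energy in the original image, as in the caption of Figure 4.3-8.

```python
import numpy as np

def threshold_samples(T, level):
    """Zero every transform sample whose amplitude is below `level`."""
    mask = np.abs(T) >= level
    return np.where(mask, T, 0), mask

rng = np.random.default_rng(1)
image = rng.random((32, 32))                 # synthetic stand-in image
T = np.fft.fft2(image)
level = np.percentile(np.abs(T), 75)         # assumed threshold choice
T_kept, mask = threshold_samples(T, level)

# Reconstruct and measure the error relative to the original image energy.
decoded = np.fft.ifft2(T_kept)
mse = np.sum(np.abs(image - decoded) ** 2) / np.sum(image ** 2)
```

By Parseval's relation the normalized error equals the fraction of transform energy carried by the discarded samples, which is why discarding only low-amplitude samples costs little.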

The  influence  of  apodizing,  a  ten-element  tapered  window  in 
this  case,  is  demonstrated  in  Figure  4.3-9. 
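A ten-element taper can be sketched as below. The text does not state the shape of the window, so the raised-cosine ramp used here is an assumption, as are the names `tapered_window` and `apodize`.

```python
import numpy as np

def tapered_window(n, taper=10):
    """Window of length n: flat interior, raised-cosine ramp at each end."""
    w = np.ones(n)
    ramp = 0.5 * (1.0 - np.cos(np.pi * (np.arange(taper) + 0.5) / taper))
    w[:taper] = ramp            # rising edge
    w[-taper:] = ramp[::-1]     # falling edge, mirror image of the rise
    return w

def apodize(image, taper=10):
    """Apply the separable two-dimensional window to an image."""
    wy = tapered_window(image.shape[0], taper)
    wx = tapered_window(image.shape[1], taper)
    return image * np.outer(wy, wx)

w = tapered_window(64)
apodized = apodize(np.ones((64, 64)))
```

Only the outer ten elements along each edge are attenuated; the interior of the image is left unchanged.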


Figure 4.3-8. "Threshold" Coding Experiment (Fourier Transform).
The Mean Squared Error (M.S.E.) is Normalized Relative to Energy
in Original Image.
(a) M.S.E. 1.83%; bit rate 0.6. (b) M.S.E. 1.18%; bit rate 0.42.
(c) Decoded Transform Plane for COUPLE Image.
(d) Decoded Transform Plane for GIRL Image.

Figure 4.3-9. Coding Experiment with Apodizing (Fourier).
(a) Decoded Image (0.4 bit). (b) Original Amplitude.
(c) Decoded Amplitude. (d) Original Phase. (e) Decoded Phase.

4.4 Channel Coding Considerations

There are two basic philosophies concerning channel noise
sensitivity evaluation of the picture coder. In one case, channel
errors are permitted to occur in the source coded data and,
consequently, the reconstructed image will be affected by the channel
noise. A "well-behaved" coding process will be insensitive to
channel errors. The second approach implies the requirement for
channel coding and in effect assumes that by proper channel coding
error-free transmission is possible. The author is a strong
believer in the latter philosophy.

Lack of sensitivity to channel errors is a desirable image
coding feature. It is easy to demonstrate, however, that, in general,
efficient data compression and insensitivity to channel errors are
contradictory concepts. The fundamental theoretical basis for any
data compression procedure is redundancy in the data source. The
fact that the source output is correlated permits representation of
the source in "compressed" form. An efficient data compression
procedure removes the existing source correlation and produces an
output which, by design, will be uncorrelated. In the binary
representation of the compressed data, each bit will acquire an added
importance and its reversal is more apt to degrade the quality of
the reconstructed data than a similar occurrence of error in the
original (uncompressed) data.

It is not surprising that most efficient image coding algorithms,
particularly contour coding techniques, are very sensitive to channel
errors. For the latter method, a single bit error is likely to prevent
the entire image reconstruction.

Source coding removes the source redundancy; conversely,
redundancy is reintroduced by channel coding. For the channel coding
procedure, it is important that its input have a particular
statistical structure. Since the channel is unlikely to have been
designed to accommodate any particular source redundancy, it is
anticipated that channel usage is optimum when its input is
statistically uncorrelated.

According to the above statement, source redundancy, i.e.,
finite memory, is undesirable for subsequent channel coding which
assumes a memoryless source. The PCM form of the image is
highly correlated. The high degree of correlation can be
demonstrated in the binary equivalent of the image.

A quantitatively meaningful demonstration of correlation is
the correlation function calculated from the binary equivalent of
image segments. Each value of the correlation function $R_j$ is
determined from a data segment of N values:

$$R_j = \sum_{i=1}^{N-j} (x_i - \bar{x})(x_{i+j} - \bar{x}) \tag{4.4-1}$$
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and 


(4.4-2) 


The data $\{x_i\}$ are the actual binary representation of the image,
that is, each $x_i$ is either 1 or 0. Calculation of the correlation
function for N = 10,000 from the PCM "Girl" image is shown in Figure
4.4-1a. The structural form of $R_j$ in Figure 4.4-1a is consistent
with the eight-bit representation: each relative maximum occurs at
multiples  of  eight.  is  the  variance  of  the  binary  stream. 
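The bit-stream correlation can be illustrated with synthetic data. The sketch below is not the computation behind Figure 4.4-1: the division by the number of terms and by the zero-lag value is an assumed normalization, the "PCM" data are random eight-bit values each held for four samples (a crude stand-in for a highly correlated image), and the "coded" data are independent equiprobable bits. The names `bit_correlation`, `r_pcm`, and `r_rand` are hypothetical.

```python
import numpy as np

def bit_correlation(bits, max_lag):
    """Correlation of a 0/1 sequence for lags 0..max_lag, scaled so lag 0 is 1."""
    x = np.asarray(bits, dtype=float)
    d = x - x.mean()                         # remove the mean of the segment
    n = d.size
    r = np.array([np.dot(d[:n - j], d[j:]) / (n - j)
                  for j in range(max_lag + 1)])
    return r / r[0]

rng = np.random.default_rng(2)

# Stand-in for PCM data: 8-bit values, each repeated over four samples.
pixels = np.repeat(rng.integers(0, 256, size=313, dtype=np.uint8), 4)[:1250]
pcm_bits = np.unpackbits(pixels)             # 10,000 bits, most significant first
r_pcm = bit_correlation(pcm_bits, 40)

# Stand-in for a memoryless source: independent equiprobable bits.
r_rand = bit_correlation(rng.integers(0, 2, size=10_000), 40)
```

For the eight-bit stand-in the correlation peaks at lags that are multiples of eight and is small in between, while the independent bit stream shows no structure at any nonzero lag.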

The similar calculation of the correlation function for the
source coder output of Chapter 4 is shown in Figure 4.4-1b. The
result indicates that the source coding algorithm output is equivalent
to a memoryless source.

Lack of correlation and the significant bit reduction in the
output of the image coding procedure indicate that the compressed data
are expected to be sensitive to channel errors. This sensitivity is
evident also upon careful examination of the coding process of
Section 4.1. Both the number and the location of quantum levels are
"dynamically" determined. Channel error can affect both of these
quantization parameters. Erroneous determination of the number of
quantum levels is a catastrophic error. The synchronization in the
adaptive coding procedure will be lost and all subsequent values will
be drastically altered.


Figure 4.4-1. Typical Bit Stream Correlation Properties of GIRL
Image. (b) Adaptive Transform Code. Horizontal axis: lag (bits),
0 to 100.

It can be seen that the adaptive phase coder output should not
be transmitted over noisy channels without channel coding. Although
the degree of image compression depends on the image correlation
via the generalized power spectral density, it can be controlled to
virtually any reasonable value by an appropriate linear filter. Thus,
the additional bandwidth requirement for the channel coder-introduced
redundancy can easily be offset by the image coder at the expense of
additional low-pass filtering.

By relaxing the requirement for complete adaptivity, the coding
algorithm can be made less sensitive to channel errors. The prior
specification for bit assignment according to some conventional
models retains the flexibility for determination of quantum levels;
however, catastrophic errors resulting from loss of synchronization
can no longer occur.

Using the polynomial surface fit for the optimum quantization
parameters (subsection 3.7) would also avoid catastrophic errors,
provided that the appropriate coefficients are transmitted
without errors. Neither technique was, however, employed. In
either case, the additional complexity could be avoided by proper
channel coding, which probably would require less total effort.


5.  EXPERIMENTAL  RESULTS,  II  (COLOR) 


Transform coding techniques for monochrome images have
been successfully utilized by various researchers. For other, more
complex, types of images, redundancy exists in parameters other
than the two spatial variables. This chapter considers the extension
of the algorithm of Chapter 4 to color images, while Chapter 6
implements the coding algorithm for a sequence of time-varying
images.

Further extensions could include the simultaneous consideration
of color and time; however, this was not done here. It should be
emphasized that even for three-dimensional data, e.g., color
or time-dependent images, the experimental difficulties become quite
significant. The generation and calibration of properly registered
frames is a major effort by itself. Similarly, the display and
recording of a color image requires a great deal of additional hardware
and care as compared to monochrome images. Furthermore, the third
dimension significantly increases the data handling. The
experimental difficulties listed above have kept research on color and
frame-to-frame coding at a fraction of the effort expended on the
monochrome case.

5.1 Color Image Representation

A passive, opaque (non-emitting) object becomes "visible" by
reflecting radiation which is incident on it. The reflection process
is selective; thus, the relative amount of reflected energy depends
on the local characteristics of the object as well as on the spectral
distribution of the incident radiation. The physical reflection
process, in effect, specifies the image.

The spectral dependence is "integrated out" for monochrome
images. The exact characterization of actual visual scenes requires
the specification of the spectral component, which is accomplished by
the third image variable, the wavelength ($\lambda$). Symbolically, the
color image is an analog function of three variables, $I(x, y, \lambda)$.
Prior to the coding procedure, the image must be sampled along the
spectral axis in addition to the discretization of the two spatial
coordinates.

The sampling procedure applied to the wavelength very strongly
depends on the ultimate purpose for which the image was recorded.
Formally, the spectral sampling can be written

$$I(x, y, j) = \int_0^\infty r_j(\lambda) \, I(x, y, \lambda) \, d\lambda, \qquad j = 1, \ldots, N \tag{5.1-1}$$

The "spectral aperture," $r_j(\lambda)$, determines the weighting of the
spectral components for the determination of the j-th sample. The
number of samples, N, depends on the application. Equation (5.1-1)
can represent the monochrome image of Chapter 4 by specifying
N = 1 and $r_1$ to be a constant over the visible portion of the spectrum.
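A discretized form of the spectral sampling integral can be sketched as follows. The wavelength grid (400 to 700 nm), the trapezoidal rule, the boxcar band apertures, and the names `spectral_samples`, `cube`, and `bands` are all assumptions made for illustration.

```python
import numpy as np

def spectral_samples(cube, wavelengths, apertures):
    """Trapezoidal approximation of the spectral sampling integral.

    cube:        (rows, cols, n_lambda) samples of I(x, y, lambda)
    wavelengths: (n_lambda,) increasing wavelength grid
    apertures:   (N, n_lambda) samples of the N spectral apertures
    returns:     (rows, cols, N) array of spectral samples
    """
    step = np.diff(wavelengths)
    weights = np.zeros(wavelengths.shape)
    weights[:-1] += step / 2                 # trapezoid weight, left ends
    weights[1:] += step / 2                  # trapezoid weight, right ends
    return np.einsum("xyl,jl,l->xyj", cube, apertures, weights)

wavelengths = np.linspace(400.0, 700.0, 31)  # assumed visible band, in nm
cube = np.full((4, 4, 31), 2.0)              # flat spectrum at every pixel

# Monochrome case: N = 1 with a constant aperture over the visible band.
mono = spectral_samples(cube, wavelengths, np.ones((1, 31)))

# Three-band case with hypothetical boxcar apertures.
limits = [(400, 500), (500, 600), (600, 701)]
bands = np.stack([(wavelengths >= lo) & (wavelengths < hi)
                  for lo, hi in limits]).astype(float)
rgb = spectral_samples(cube, wavelengths, bands)
```

With the flat spectrum used here, the monochrome sample is simply the spectral level multiplied by the width of the band.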

The value of N may be in excess of 20 for what is generally
referred to as multispectral data. The functional form of $r_j$, in this
case, is usually an approximate delta function centered at a specific
wavelength value, $\lambda_j$. Not all $\lambda_j$'s are necessarily in the
visible spectral region. Multispectral data are generally utilized in
computer classification algorithms rather than for the actual
reproduction of the visual scene for human viewing.

The  spectral  sampling  is  greatly  simplified  for  the  case  when 
the  purpose  of  image  recording  is  for  subsequent  display  for  human 
visual  viewing.  The  human  eye  does  not  respond  individually  to  the 
infinite  number  of  spectral  elements  present  in  a  visual  scene.  It  is 
rather  a  triplet  of  photoelectric  detectors  whose  individual  responses 
cover  the  low  (red),  medium  (green)  and  high  (blue)  spectral  regions. 
The  human  visual  process  determines  the  color  on  the  basis  of  the 
simultaneous  "readings"  of  these  detectors  (Cornsweet,  1970). 

In effect, the human eye perceives the complex visual scenes
corresponding to its three detectors. Within this somewhat
oversimplified model the eye performs the mapping of the continuous
wavelength axis into a set of three values. The mapping is of the
form of Equation (5.1-1) and it is given by the following three
equations.

$$R(x, y) = \int_0^\infty r_R(\lambda) \, I(x, y, \lambda) \, d\lambda \tag{5.1-2}$$

$$G(x, y) = \int_0^\infty r_G(\lambda) \, I(x, y, \lambda) \, d\lambda \tag{5.1-3}$$

$$B(x, y) = \int_0^\infty r_B(\lambda) \, I(x, y, \lambda) \, d\lambda \tag{5.1-4}$$

Letter indices R, G, and B designate the spectral region in
which the appropriate "eye detector" reaches its maximum response
(red, green, and blue, respectively). Equations (5.1-2) through
(5.1-4) imply the nonunique spectral sensitivity of the human eye.
According to these equations, a change of $I(x, y, \lambda)$ will not be
perceived as long as the left sides of these equations are not
altered.

From  the  standpoint  of  this  bandwidth  reduction  algorithm,  the 
coding-decoding  process  will  simultaneously  consider  the  three 
image  signals  R(x,  y),  G(x,  y),  and  B(x,y).  The  redundancy  reduction 
is  achieved  by  considering  the  correlation  among  the  three  color 
planes  in  addition  to  the  spatial  correlation  within  each  plane. 

5.2  Description  of  the  Algorithm 

The coding scheme for the multidimensional data which is
presented in this dissertation can be put into the simple form shown
in Figure 5.2-1, in a manner similar to the monochrome case of
Figure 4.1-1.

The three-dimensional transform of the R, G, B planes yields
three transform planes $T_1$, $T_2$, $T_3$. By assumption, the samples are
uncorrelated within each transform plane as well as among the
various transform planes.

The actual implementation utilized the Fourier transform. The
three-dimensional transform is performed in two stages. The
conventional two-dimensional Fourier transform is applied to the R,
G, B planes individually. Each triplet of complex values
corresponding to identical locations in the three transformed planes is
subjected to the one-dimensional, three-point complex Fourier
transform. The 3x3 complex unitary transform matrix is shown in
Figure 5.2-2.

$$F = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0.5\,(-1 + j\sqrt{3}) & -0.5\,(1 + j\sqrt{3}) \\ 1 & -0.5\,(1 + j\sqrt{3}) & 0.5\,(-1 + j\sqrt{3}) \end{bmatrix}$$

Figure 5.2-2. Three-Element Fourier Transform Matrix

The structure of the 3x3 complex matrix indicates that the
first of the three final transform planes ($T_1$) is simply the average
of the two-dimensional transforms of the R, G, B planes.
In effect, the first plane, $T_1$, contains the brightness, or luminance,
information. The $T_2$ and $T_3$ planes designate the fluctuation around
the average of the three planes and thus represent the chrominance
information. It is not necessary, however, to make reference to
luminance and chrominance designations in order to implement the
coding procedure.
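The two-stage transform can be sketched numerically. This is an illustration under stated assumptions, not the original program: the three-point matrix is used without a scale factor (so the first plane is the sum, i.e. three times the average, of the three two-dimensional transforms), the input is a random test array, and the names `F3` and `color_transform` are hypothetical.

```python
import numpy as np

# Three-point Fourier matrix built from the cube root of unity
# W = 0.5 * (-1 + j*sqrt(3)); note that W**2 = -0.5 * (1 + j*sqrt(3)).
W = np.exp(2j * np.pi / 3)
F3 = np.array([[1, 1, 1],
               [1, W, W**2],
               [1, W**2, W]])

def color_transform(rgb):
    """rgb: (3, rows, cols) real array -> (3, rows, cols) complex planes."""
    # Stage 1: two-dimensional Fourier transform of each color plane.
    planes = np.fft.fft2(rgb, axes=(1, 2))
    # Stage 2: three-point transform of each triplet at identical locations.
    return np.tensordot(F3, planes, axes=(1, 0))

rng = np.random.default_rng(3)
rgb = rng.random((3, 16, 16))                # synthetic stand-in for R, G, B
T = color_transform(rgb)
planes = np.fft.fft2(rgb, axes=(1, 2))
```

Without a scale factor the matrix satisfies F3 F3^H = 3 I, so it is unitary up to a constant; dividing by the square root of three would make it exactly unitary.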

The coding-decoding procedure of Chapter 4 is applied to the
three transform planes ($T_1$, $T_2$, $T_3$) individually. The only
interdependence among the three separate coding processes is that the
scale factor that relates the number of quantization levels to the
sample variance is determined for $T_1$ and the same value is utilized
for $T_2$ and $T_3$ as well.

As in the procedure utilized for the monochrome case in
Chapter 4, the three filters can be used by the transmitter to modify
the coding process. The receiver will adapt to the filtered planes
without any control information. The close similarity of the R, G,
B planes and their two-dimensional transforms implies that the largest
image energy component will be concentrated on the first transform
plane.

The effects of low-pass filtering the $T_1$, $T_2$, and $T_3$ planes are
similar to those of the procedure previously used for the Y, I, Q
system (Pratt, 1971). The "luminance" plane $T_1$, in effect, represents
the spatial resolution, which will be degraded by a high degree of
low-pass filtering. The "chrominance" planes $T_2$ and $T_3$ can be
subjected to rather strong low-pass filtering without serious image
degradation. The replacement of every value by zero in the $T_2$ and $T_3$
planes reduces the color image to a monochrome equivalent. This
monochrome image is simply the average of the R, G, and B signals.
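The limiting case of discarding both chrominance planes can be checked numerically. The sketch below assumes an unscaled three-point Fourier matrix with the factor of one third placed in the inverse, a random test array in place of real color planes, and hypothetical helper names `forward` and `inverse`.

```python
import numpy as np

W = np.exp(2j * np.pi / 3)                   # cube root of unity
F3 = np.array([[1, 1, 1], [1, W, W**2], [1, W**2, W]])

def forward(rgb):
    """2-D FFT of each plane, then the three-point transform across planes."""
    return np.tensordot(F3, np.fft.fft2(rgb, axes=(1, 2)), axes=(1, 0))

def inverse(T):
    """Undo the three-point transform (scale 1/3), then the 2-D FFTs."""
    planes = np.tensordot(F3.conj().T, T, axes=(1, 0)) / 3.0
    return np.fft.ifft2(planes, axes=(1, 2)).real

rng = np.random.default_rng(4)
rgb = rng.random((3, 16, 16))                # synthetic stand-in for R, G, B
T = forward(rgb)
roundtrip = inverse(T)                       # all three planes retained

T_mono = T.copy()
T_mono[1:] = 0                               # discard both chrominance planes
mono = inverse(T_mono)
```

With the second and third planes set to zero, each of the three decoded color planes equals the pixel-by-pixel average of the original R, G, and B planes.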

The value of the adaptive nature of the color coding process
as indicated in Figure 5.2-1 cannot be overemphasized. The
appropriate filters can be specified for a specific color system. For
the general case, the R, G, B signals may be referenced to a wide
variety of primaries. The degree of low-pass filtering which may
be tolerated will depend on the original color digitizing equipment
and the subsequent calibration procedure, if any. The low-pass
filter band limits are implemented by the transmitter and the
specification is very likely the result of human visual inspection.
Whenever the cost in effort associated with color image transmission
is high, optimization of the three filters is likely to require
considerable effort.

The significant property of the adaptive procedure is that once
the transmitter decides on an optimized set of filters (i.e., the
tolerance of the transform planes to low-pass filtering has been
determined) none of this information is required by the receiver.
All information bits relate to the quantized transform domain, and
no bookkeeping information is required.

The same comments made in Chapter 4 regarding advantages
and disadvantages apply for the adaptive color coder as well. The
adaptive procedure includes the benefits of both zonal coding
(nonuniform bit assignment and quantum levels) and threshold coding
(adaptivity in deciding which regions can be discarded). The
fundamental disadvantage of the adaptive coder is the variable buffer
requirement for the receiver. The bit rate, or equivalently, the
degree of compression, is determined by the transmitter and only
after decoding will this information be available to the receiver.

The three-primary color system utilized the three-dimensional
Fourier decomposition only. The one-dimensional, 3-point Walsh
transform is not defined. An alternate composite system could
include the two-dimensional Walsh transform of the R, G, B planes
followed by the one-dimensional Fourier transform. The alternate
system was not actually implemented. If the number of color
planes N is of the form $N = 2^n$, where n is an integer, the Walsh
decomposition is possible for the third dimension. The
three-dimensional Walsh decomposition of Chapter 6 could be directly
utilized for multispectral data of four input planes.
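The power-of-two restriction can be seen from the usual recursive construction of the transform matrix. The sketch below builds the matrix in natural (Hadamard) ordering by repeated Kronecker products; the ordering used in the dissertation is not stated in this section, and the function name `hadamard` is hypothetical.

```python
import numpy as np

def hadamard(n_planes):
    """Return the n_planes x n_planes matrix of +1/-1 entries.

    The construction doubles the size at each step, so it exists only
    when n_planes is a power of two.
    """
    if n_planes < 1 or n_planes & (n_planes - 1):
        raise ValueError("number of planes must be a power of two")
    H = np.array([[1]])
    core = np.array([[1, 1], [1, -1]])
    while H.shape[0] < n_planes:
        H = np.kron(core, H)                 # double the size
    return H

H4 = hadamard(4)                             # four input planes
```

The rows are mutually orthogonal, so the matrix divided by N serves as its own inverse; a request for three planes is rejected because no such matrix arises from this construction.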

5.3  Pictorial  Examples 

As  with  the  monochrome  coder  of  the  previous  chapter,  the 
color  coding  algorithm  has  been  programmed.  The  three-dimensional 
Fourier  transform  was  utilized. 

Figure 5.3-1 shows the three transform planes corresponding
to the three-dimensional transform. Results of the coding-decoding
experiments are shown in Figure 5.3-2. Tristimulus color planes
for one of the decoded images are shown in Figure 5.3-3. Decoded
transform planes for a typical case are shown in Figure 5.3-4.
This figure also indicates the varying amount of low-pass filtering
in the different transform planes. The transform statistics can be
significantly altered by apodizing. The influence of a ten-element
image window is demonstrated in Figure 5.3-5.

It should be noted that the d.c. term for the three-dimensional
transform is located in the first transform plane. For the other
transform planes, the relative maximum amplitude location is not
predictable.


Figure 5.3-1. Three-Dimensional Fourier Transform Display of
Color GIRL Image. (a) Amplitude (Plane I); (b) Phase (Plane I);
(c) Amplitude (Plane II); (d) Phase (Plane II);
(e) Amplitude (Plane III); (f) Phase (Plane III).

Figure 5.3-2. Coding-Decoding Examples. The Mean Square Error
(M.S.E.) is Normalized Relative to Energy in Original Color Image.
(a) Apodized; bit rate 0.65. (b) M.S.E. 1.9%; bit rate 0.55.
(c) M.S.E. 1.2%; bit rate 1.19. (d) M.S.E. 4.5%; bit rate 0.54.
(e) M.S.E. 2.4%; bit rate 1.2.

Figure 5.3-3. Tristimulus Color Planes of Decoded GIRL Image
(0.62 bit).

Figure 5.3-4. Decoded Transform Planes for Color GIRL. Panels
include (c) Amplitude (Plane II); (e) Amplitude (Plane III);
(f) Phase (Plane III).

Figure 5.3-5. Transform Domain Display Associated with Apodized
Color GIRL Image. Panels include (a) Amplitude (Plane I);
(c) Amplitude (Plane II); (d) Phase (Plane II);
(e) Amplitude (Plane III); (f) Phase (Plane III).


6.  EXPERIMENTAL  RESULTS,  III  (INTERFRAME  CODING)

It is obvious that significant redundancy exists among members
of image sequences representing the temporal variation of visual
scenes. An efficient interframe coder removes the redundancy in
the sequence of similar images as well as within each image
(Haskell, Mounts, and Candy, 1972).

In a manner similar to the color coding approach of the
previous chapter, the image sequence can be considered as
three-dimensional data consisting of two spatial coordinates and one time
coordinate. An additional similarity between interframe and color
coding is that they both exploit the limitations of the human visual
system. For most practical applications, three primary color
components are sufficient to represent most apparent colors within the
spectral range of the human visual response. The limited temporal
resolution of the human eye permits the sampling of the temporal
variable at approximately 60 Hertz.

It  is  important  to  note  that  the  sampling  procedure  thus 
specified  by  the  inadequacy  of  the  human  visual  process  does  not 
necessarily  correspond  to  the  classical  sampling  requirement. 
Consequently,  emphasis  by  the  coder  is  on  preservation  of  the 
appearance  of  the  image  rather  than  on  the  actual  image  itself. 

There are many applications for interframe coding, the most
obvious being television. Although various spatial domain techniques
have been successfully utilized for redundancy reduction in video
signals, transform techniques have not been previously considered.

This chapter considers the theoretical implications for
transform domain coding. The adaptive phase coding technique of
Chapter 4 is extended to the interframe case. Detailed discussion of
the algorithm as well as examples of the coding procedures are also
given.

6.1 Analysis of the Interframe Case

A sequence of images can formally be written as I(x, y, t). The
spatial variation is indicated by x and y and the temporal variation by
t. The three-dimensional function I represents the continuous
(nondiscrete) variation of a visual scene. The physical nature of the
imaging process requires that I(x, y, t) be non-negative.

If I is band-limited with respect to all three of its variables,
then the Lukosz bound applies, at least formally. In fact, the
discussion in subsection 2.6 indicated the tightening of this bound for
increasing numbers of dimensions. If the Lukosz bound is to be valid
for the sampled version of I, the sampling rate must be at least twice
the band limit for each dimension. The various imaging devices
band-limit the spatial frequency spectrum of images; however, no
similar band-limiting occurs for the temporal variation. Further,
sampling along the time axis is performed to match the limitations of
the human visual process and bears no relation to the structural form
of the actual image. Consequently, the Lukosz bound does not apply for
the temporal portion of interframe imagery.

The utility of statistical coding should also be discussed as
applicable to the interframe case. Statistical coding procedures
utilize the stochastic rule that exists among the elements that are
to be coded.

The existence of strong nonstochastic dependence among
elements of a multidimensional image implies that a purely statistical
approach to image coding is suboptimal. The intraframe and color
images can be sufficiently characterized by statistical means. The
statistical approach can be extended to the interframe case (as is
done in the remaining subsections of this chapter). However, it is
interesting to note various deterministic relations which apply to the
interframe case and which are ignored by the statistical approach.

The above indicated deterministic rules can be formally
represented by operator notation. Let $I_j(x, y) = I(x, y, t_j)$ and
$I_k(x, y) = I(x, y, t_k)$ be two individual images in a sequence of
images describing a time-dependent visual scene. Specifically, $I_j$
and $I_k$ represent the image at times $t_j$ and $t_k$. The following
specific question should be addressed: given the image pair $I_j$ and
$I_k$, is there a nonstochastic operator L, such that, at least
approximately,

$$I_k(x, y) = L\{I_j(x, y)\} \tag{6.1-1}$$

Any statistical coding approach which ignores Equation (6.1-1)
and the inherent redundancy it implies cannot be optimal.

In the following, the various basic forms of the operator L are
considered. The corresponding effects in the transform domain are
addressed.

The operator L can represent local as well as global changes
between $I_j$ and $I_k$. The former case implies that only a relatively
small part of the image is changing. Under global changes, the entire
image is understood to be changing in some (nonstochastic) systematic
manner. The local image variation indicates the temporal evolution
of a visual scene as observed by a stationary imaging device. Global
image variation is the probable result of the movement of the
imaging device.

Some  of  the  obvious  global  variations  are  image  shift,  rotation, 
defocus  and  magnification.  Other  more  complicated  global  image 
changes  as  well  as  the  simultaneous  occurrence  of  the  ones  listed 
above  clearly  are  possible.  These  global  image  variations  cannot  be 
characterized  statistically;  thus,  the  statistical  encoder  is  not  likely 
to  remove  the  entire  redundancy  which  is  present  in  interframe 
imagery. 

The extension of the adaptive phase coder of Chapter 4 to the
interframe case is likely to be suboptimal because of the statistical
approach taken. For local variations and/or small global changes,
the statistical correlation among neighboring frames is relatively
high; thus, utilization of the statistical approach will lead to modest
bandwidth reduction over the intraframe approach.

Appropriate changes in the transform domain, resulting from
the effects of the operator L in Equation (6.1-1), can be modeled by
the use of simplified examples.

Local Variations

Consider the following case of frame-to-frame change as
indicated in Figure 6.1-1. Let $g_3(x, y)$ represent a subregion in
frame A which is shifted a distance a in the horizontal direction by
the time that frame B is generated. The unchanging background is
represented by $g_1(x, y)$. The altered parts of the background are
denoted by $g_2(x, y)$ and $g_4(x, y)$. In frame A, $g_2(x, y)$ is part of
the frame while $g_4(x, y)$ is covered by $g_3(x, y)$. The roles of
$g_2(x, y)$ and $g_4(x, y)$ are interchanged in frame B. Equivalently,
this can be expressed as

$$g_A = g_1 + g_2 + g_3 \tag{6.1-2}$$

$$g_B = g_1 + g_3(x + a, y) + g_4 \tag{6.1-3}$$

Here $g_A$ and $g_B$ represent frames A and B. Note also that the
argument (x, y) is omitted for notational convenience. Although
Equations (6.1-2) and (6.1-3) model a rather simplified interframe
change, they approximate actual applications such as the Picturephone
model. For the latter case, $g_3$ can be considered as the model for the
Picturephone speaker and $g_1$, $g_2$ and $g_4$ represent the various
background segments.

In terms of the previously developed notation, the
frame-to-frame change can be put in a simpler form. First, the following
additional definitions are made:

$$g_a = g_2 + g_3 \tag{6.1-4}$$

$$g_b = g_3(x + a, y) + g_4 \tag{6.1-5}$$

therefore

$$g_A = g_1 + g_a \tag{6.1-6}$$

$$g_B = g_1 + g_b \tag{6.1-7}$$

Equations (6.1-6) and (6.1-7) indicate that, for the model under
discussion, each frame can be decomposed into a varying part and
one that remains unaltered between consecutive frames.

Specifying  the  discussion  to  the  Fourier  transform,  the  above 

described  model  permits  qualitative  predictions  for  inerframe 

changes  in  the  frequency  domain.  Let  G  (u,  v)  be  the  Fourier  trans- 

s 

form  of  gs(x,  y).  The  subscript  s  may  represent  any  of  those  pre¬ 
viously  utilized:  1,2,  3,4,  a,  b,  A,  and  B.  Consequently, 


G_s(u, v) = G_s = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g_s(x, y) \exp[-2\pi j(ux + vy)] \, dx \, dy    (6.1-8)

          = |G_s| \exp(j\theta_s) = G_{sR} + jG_{sI}    (6.1-9)

(6.  1-10) 

Note that the arguments (u, v) are dropped whenever possible for
notational convenience.

Using the developed notation for the frequency domain, the
Fourier transforms of Equations (6.1-6) and (6.1-7) can be written
as

G_A = G_1 + G_a                                (6.1-11)

G_B = G_1 + G_b                                (6.1-12)
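The decomposition can be checked numerically on a synthetic pair of
frames. The following is a minimal sketch, not taken from the
dissertation: the frame size, the shift, the rectangular object and the
random textures are all illustrative assumptions, and NumPy's discrete
FFT stands in for the continuous Fourier transform.

```python
import numpy as np

rng = np.random.default_rng(0)
N, a = 64, 4                                  # frame size and shift (assumed values)

mask_A = np.zeros((N, N), dtype=bool)         # object support in frame A
mask_A[16:48, 20:36] = True
mask_B = np.roll(mask_A, -a, axis=1)          # object support after the shift

background = rng.uniform(0.0, 1.0, (N, N))    # synthetic background texture
texture = rng.uniform(0.0, 1.0, (N, N))       # synthetic object texture

g1 = background * ~(mask_A | mask_B)          # background never covered
g2 = background * (mask_B & ~mask_A)          # visible in A, covered in B
g4 = background * (mask_A & ~mask_B)          # covered in A, visible in B
g3 = texture * mask_A                         # moving subregion in frame A
g3_shifted = np.roll(g3, -a, axis=1)          # g3(x + a, y); x is the column index

g_a, g_b = g2 + g3, g3_shifted + g4           # varying parts of each frame
g_A, g_B = g1 + g_a, g1 + g_b                 # complete frames A and B

G1, Ga, Gb = (np.fft.fft2(g) for g in (g1, g_a, g_b))
GA, GB = np.fft.fft2(g_A), np.fft.fft2(g_B)

# Linearity of the transform: each frame spectrum is the sum of the
# unchanged-part spectrum and the varying-part spectrum.
print(np.allclose(GA, G1 + Ga), np.allclose(GB, G1 + Gb))
```

Because the transform is linear, the same split holds for any choice of
textures; only the relative sizes of the phasors change.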


The interframe variations in the frequency plane given by
Equations (6.1-11) and (6.1-12) can be demonstrated by phasor
diagrams. Simultaneous display of Equations (6.1-11) and (6.1-12) is
shown in Figure 6.1-2a. This figure indicates a "typical" example
of the interframe variation and should only be viewed as a
qualitatively demonstrative example. The following assumptions are also
inherent in this graphical demonstration: (1) g_1, g_a, and g_b have
"similar" Fourier decompositions, (2) the region over which g_1 is
defined is larger than the similarly specified regions for g_a and g_b.
The above assumptions imply that the power spectral density functions
are similar for g_1, g_a, and g_b except for different scale factors.

The graphical representation implies that both the amplitude
and phase values are strongly correlated. Furthermore, the following
inequalities for phase and amplitude changes are easily obtained
from Figure 6.1-2a.

|\Delta\theta| = |\theta_A - \theta_B| \le \tan^{-1}(|G_b| / |G_1|) + \tan^{-1}(|G_a| / |G_1|)    (6.1-13)


(a) Phasor Representation of Interframe Changes

The conditions for equality in Equations (6.1-13) and (6.1-14)
are illustrated in Figures 6.1-2b and c.

As a further demonstration of the phase and amplitude
correlation, quantitative evaluation of Equations (6.1-13) and (6.1-14)
can be made by substituting "reasonable" values for G_1, G_a, and G_b.
In particular, let |G_1| = |G_a| = |G_b|. This condition implies that
the areal extents of the changing and unchanging image segments are
equal, with similar power spectral densities. Although the amplitude
change constraint is not significant, as indicated by Equation (6.1-14),
the maximum phase change is restricted to π/2, which is a fourfold
reduction of the phase range.

For a 10 percent image area change, and similar assumptions
as before, the application of Equation (6.1-14) indicates a phase
change of less than π/15, which is a reduction of the phase range by
a factor of 30.
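The two numerical cases quoted above can be checked with a few lines of
arithmetic. This sketch rests on two assumptions made here for
illustration: that the phase bound is the sum of two arctangents,
arctan(|G_a|/|G_1|) + arctan(|G_b|/|G_1|), and that a 10 percent area
change corresponds to magnitude ratios of 0.1. The function name is
hypothetical.

```python
import math

FULL_RANGE = 2.0 * math.pi   # unconstrained phase range

def phase_change_bound(ratio_a, ratio_b):
    """Assumed bound on |theta_A - theta_B|.

    ratio_a = |G_a| / |G_1| and ratio_b = |G_b| / |G_1|.
    """
    return math.atan(ratio_a) + math.atan(ratio_b)

# Case 1: changing and unchanging segments of equal magnitude.
equal = phase_change_bound(1.0, 1.0)

# Case 2: roughly 10 percent of the image area changes (assumed ratio 0.1).
small = phase_change_bound(0.1, 0.1)

print(equal / math.pi)        # fraction of pi allowed in case 1
print(FULL_RANGE / equal)     # reduction factor of the phase range, case 1
print(small < math.pi / 15)   # case 2 stays below pi/15
print(FULL_RANGE / small)     # reduction factor of the phase range, case 2
```

Under these assumptions the first case gives a bound of π/2 (a fourfold
reduction) and the second a bound slightly below π/15, in line with the
reduction factor of about 30 quoted in the text.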

6.2 Description of the Algorithm

The extension to the interframe case of the adaptive algorithm
of Chapter 4 is structurally quite similar to the color coding
implementation. The coding procedure utilizes the three-dimensional
transform (Fourier or Walsh) to decorrelate a set of four consecutive
image frames. A schematic diagram similar to the one given for the
color coder is shown in Figure 6.2-1.

Figure 6.2-1. Schematic of Coding-Decoding Process for the Interframe Case


The three-dimensional transformation consists of the successive
application of the two-dimensional 256 x 256 transform to each
image plane and the one-dimensional four-point transform along the
temporal axis. The four-point transform matrices are shown in
Figures 6.2-2 and 6.2-3 for the Fourier and Walsh matrices,
respectively.

By assumption, the four transform planes are uncorrelated
individually as well as relative to each other. After the application
of the three-dimensional transform (either Fourier or Walsh) the
first transform plane is the average of the two-dimensional
transforms of the four input images. The other three transform planes
represent fluctuations around the average. One could qualitatively
argue that the first transform plane represents the unchanging image
segment while the other three transform planes contain information
relating to temporal variation.
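The separable structure described above can be sketched as follows.
This is an illustration under stated assumptions, not the coder's
actual implementation: the matrices are the standard four-point DFT
matrix and a sequency-ordered Walsh matrix (the ordering and sign
conventions of Figures 6.2-2 and 6.2-3 are not reproduced here), a
spatial FFT is used for both cases, small 8 x 8 frames replace the
256 x 256 images, and `transform_3d` is a hypothetical name.

```python
import numpy as np

# Standard four-point DFT matrix (rows are temporal frequencies 0..3).
FOURIER_4 = np.array([[1, 1, 1, 1],
                      [1, -1j, -1, 1j],
                      [1, -1, 1, -1],
                      [1, 1j, -1, -1j]], dtype=complex)

# Four-point Walsh matrix in sequency order (assumed ordering).
WALSH_4 = np.array([[1, 1, 1, 1],
                    [1, 1, -1, -1],
                    [1, -1, -1, 1],
                    [1, -1, 1, -1]], dtype=float)

def transform_3d(frames, temporal_matrix):
    """Apply a 2-D FFT to every frame, then a 4-point transform along time.

    frames has shape (4, N, N); the result holds 4 transform planes.
    The 1/4 scaling makes plane 0 the plain average of the 2-D spectra.
    """
    spectra = np.fft.fft2(frames, axes=(1, 2))
    return np.tensordot(temporal_matrix, spectra, axes=(1, 0)) / 4.0

rng = np.random.default_rng(0)
frames = rng.uniform(0, 255, size=(4, 8, 8))
mean_spectrum = np.fft.fft2(frames, axes=(1, 2)).mean(axis=0)

planes_f = transform_3d(frames, FOURIER_4)
planes_w = transform_3d(frames, WALSH_4)

# A sequence with no motion: four identical frames.
static = np.repeat(frames[:1], 4, axis=0)
static_planes = transform_3d(static, WALSH_4)

print(np.allclose(planes_f[0], mean_spectrum))   # first plane is the average
print(np.allclose(static_planes[1:], 0.0))       # no motion, empty planes 2-4
```

For four identical frames, planes 2 through 4 vanish, which is the
limiting case of the qualitative argument that those planes carry the
temporal variation.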

As in the color coder of Chapter 5, each of the four transform
planes is individually filtered and coded. Unlike the color coding
procedure, however, one cannot arbitrarily low-pass filter transform
planes 2 through 4. Drastic low-pass filtering of these planes will
result in the blurring of the time-varying areas without reducing the
resolution of unchanging areas.

It has been demonstrated (Budrikas, 1972) that the resolution
loss in rapidly changing areas is visually much less objectionable
than for image segments that are relatively stationary. By
appropriately "tuning" the four two-dimensional filters in Figure
6.2-2, the psychophysical properties of the human visual system
could be exploited. Although the computer-implemented algorithm
of this chapter could be utilized for the study of the relative
importance of resolution loss in moving and stationary image segments,
this was not done experimentally. The hardware required to display
the decoded interframe images in their natural medium (such as
television) was unavailable to this research effort, which restricted
the visual evaluation of the decoded image sequences to the viewing of
individual (stationary) images.

Figure 6.2-2. Four-Element Fourier Transform

The  structure  of  both  the  Fourier  and  Walsh  transform 
matrices  indicates  that  for  statistically  correlated  image  frames 
the  image  energy  will  concentrate  in  the  first  transform  plane. 
Therefore,  even  without  the  application  of  different  spatial  filters 
to  the  various  transform  planes,  the  adaptive  procedure  will  result 
in  bandwidth  reduction.  The  transform  values  in  transform  planes 
2  through  4  will  require  fewer  quantization  levels  because  of  uneven 
energy  distribution. 
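The energy concentration can be illustrated on a synthetic sequence.
This is a minimal sketch under assumptions made here: the four frames
are a common random background plus small independent perturbations,
the temporal transform is an orthonormal four-point Walsh matrix, and
the spatial transform is omitted because a unitary spatial transform
leaves the energy of each temporal plane unchanged (Parseval's
relation). The function name is hypothetical.

```python
import numpy as np

def plane_energy_fractions(frames):
    """Fraction of the total energy in each of the 4 temporal transform planes."""
    walsh = np.array([[1, 1, 1, 1],
                      [1, 1, -1, -1],
                      [1, -1, -1, 1],
                      [1, -1, 1, -1]], dtype=float) / 2.0   # orthonormal rows
    planes = np.tensordot(walsh, frames, axes=(1, 0))
    energy = np.sum(planes ** 2, axis=(1, 2))
    return energy / energy.sum()

rng = np.random.default_rng(0)
background = rng.uniform(0, 255, size=(32, 32))

# Strongly correlated frames: same background, small frame-to-frame change.
frames = np.stack([background + rng.normal(0, 5, size=(32, 32))
                   for _ in range(4)])

fractions = plane_energy_fractions(frames)
print(fractions)   # nearly all of the energy falls in the first plane
```

With so little energy left in planes 2 through 4, their samples span a
much smaller range, which is why fewer quantization levels suffice.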

The advantages of the adaptive phase coding procedure indicated
in Chapters 4 and 5 are applicable to the interframe coder as well.
Specifically, the coder will "track" the three-dimensional power
spectrum and make the bit assignment adaptively. The number and
location of quantum levels will be specified according to the local
estimated value of the power spectral density. The adaptivity feature
has an added benefit for the interframe transform coder. Unlike the
monochrome case with its exponential correlation model, the interframe
case cannot be modeled by a simple correlation function. In fact, the
highly non-stationary nature of the interframe image precludes any
fixed nonadaptive modeling of the transform domain.

6.3 Pictorial Examples

Examples for the three-dimensional Fourier and Walsh
interframe coder are shown in this subsection. Figure 6.3-1 is the
three-dimensional transform domain display. The coding examples
are given in Figures 6.3-2 through 6.3-9. An example for the
decoded transform planes is given in Figure 6.3-10.

Visual inspection of Figures 6.3-1 and 6.3-10 demonstrates
the "non-stationary" character of the three-dimensional transform
for the interframe case and the capability of the coding method to
adapt to the particular form. The structure of transform planes
2 through 4 is the result of the significant amount of image motion
in this example.

As in the monochrome case, the Fourier transform coder
outperforms the Walsh coder in terms of both mean square error and
visual appearance.

Figure 6.3-1. Three-Dimensional Fourier Transform Display
of BELL DUMMY Image Sequence. Panels: (a) First Plane,
(b) Second Plane, (c) Third Plane, (d) Fourth Plane.

This page is reproduced at the back of the report by a different
reproduction method to provide better detail.

Figure 6.3-2. Decoded BELL-GIRL Image Sequence, I (Fourier).
Bit Rate: 0.27 Bit; M.S.E.: 1.79% (normalized relative to image
energy).


Figure 6.3-3. Decoded BELL-GIRL Image Sequence, II (Fourier).
Bit Rate: 0.55 Bit; M.S.E.: 0.99% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.

Figure 6.3-4. Decoded BELL-GIRL Image Sequence, III (Walsh).
Bit Rate: 0.38 Bit; M.S.E.: 2.24% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.


Figure 6.3-5. Decoded BELL-GIRL Image Sequence, IV (Walsh).
Bit Rate: 0.68 Bit; M.S.E.: 2.15% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.

Figure 6.3-6. Decoded BELL-DUMMY Image Sequence, I (Fourier).
Bit Rate: 0.46 Bit; M.S.E.: 1.47% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.


Figure 6.3-7. Decoded BELL-DUMMY Image Sequence, II (Fourier).
Bit Rate: 0.43 Bit; M.S.E.: 1.05% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.

Figure 6.3-8. Decoded BELL-DUMMY Image Sequence, III (Walsh).
Bit Rate: 0.29 Bit; M.S.E.: 2.24% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.

Figure 6.3-9. Decoded BELL-DUMMY Image Sequence, IV (Walsh).
Bit Rate: 0.69 Bit; M.S.E.: 1.69% (normalized relative to image
energy). Panels: (a) First Image, (b) Second Image, (c) Third Image,
(d) Fourth Image.


Figure 6.3-10. Decoded Transform Planes for BELL DUMMY
Image Sequence. Panels: (a) First Plane, (b) Second Plane,
(c) Third Plane, (d) Fourth Plane.


7. SUMMARY


A new approach to transform image coding has been presented
in this dissertation. The generalized phase concept plays a dominant
role in the currently developed coding algorithms. The important
advantage of the coding algorithms is a high degree of adaptivity in
the determination of both the number and location of the quantum
levels.

Significant  adaptivity  to  the  image  power  spectral  density  is 
accomplished  without  the  assumption  or  specification  of  any  a  priori 
statistical  image  model.  In  addition,  no  bookkeeping  information  is 
required.  The  actual  image  model  is  "dynamically"  determined  from 
previously  quantized  and  reconstructed  transform  samples. 
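The reason no bookkeeping information is needed can be illustrated
with a generic backward-adaptive quantizer. The sketch below is not the
dissertation's algorithm; it is a simplified stand-in in which the
quantizer step size follows a running magnitude estimate computed only
from previously reconstructed samples. All names and parameter values
(`encode`, `decode`, `alpha`, `n_levels`, `floor`) are hypothetical.

```python
import numpy as np

def _step(scale, n_levels):
    # Uniform quantizer covering roughly +/- 3 scale units.
    return 6.0 * scale / n_levels

def _update(scale, x_hat, alpha, floor):
    # Recursive estimate driven only by the reconstructed value x_hat.
    return max(floor, np.sqrt(alpha * scale ** 2 + (1.0 - alpha) * x_hat ** 2))

def encode(samples, n_levels=16, alpha=0.8, init_scale=1.0, floor=1e-3):
    """Return quantizer indices and the encoder's own reconstruction."""
    scale, half = init_scale, n_levels // 2
    indices, recon = [], []
    for x in samples:
        step = _step(scale, n_levels)
        q = int(np.clip(np.round(x / step), -half, half - 1))
        x_hat = q * step                 # value the decoder will also obtain
        indices.append(q)
        recon.append(x_hat)
        scale = _update(scale, x_hat, alpha, floor)
    return indices, np.array(recon)

def decode(indices, n_levels=16, alpha=0.8, init_scale=1.0, floor=1e-3):
    """Rebuild the samples from the indices alone, with no side information."""
    scale, out = init_scale, []
    for q in indices:
        x_hat = q * _step(scale, n_levels)
        out.append(x_hat)
        scale = _update(scale, x_hat, alpha, floor)
    return np.array(out)

rng = np.random.default_rng(0)
envelope = np.linspace(100.0, 1.0, 200)          # slowly decaying sample magnitude
samples = envelope * rng.normal(size=200)

indices, encoder_recon = encode(samples, init_scale=100.0)
decoder_recon = decode(indices, init_scale=100.0)
print(np.allclose(encoder_recon, decoder_recon))  # both ends track identically
```

Since the encoder and decoder run the same recursion on the same
reconstructed values, the decoder recovers every step size without
receiving it, which is the property the paragraph above describes.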

The new transform coding approach was implemented through
discrete Walsh and Fourier transforms. The Fourier transform was
found to be superior to the Walsh transform. The fundamental
superiority of the Fourier transform is explained by the general image
insensitivity to (frequency domain) low-pass filtering.

Although the image transform is performed on the entire
(256 x 256) image rather than on smaller blocks, the increase in
computational complexity is modest. For example, the number of
computational steps increases by only a factor of two in going from
16 x 16 block transforms to the entire 256 x 256 transform. The
large-size transform is a disadvantage if a hard-wired configuration
is required; however, this fact is unimportant when the coding-decoding
algorithm is implemented via general purpose computers. This latter
case has potential practical utility for computer-to-computer image
transmission.
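The factor of two follows from a simple operation count. The sketch
below assumes a row-column fast transform whose cost for an n x n array
grows as n^2 log2(n); this cost model and the function names are
assumptions made here for illustration, not figures from the
dissertation.

```python
import math

def transform_cost(n):
    """Approximate operation count of an n x n fast 2-D transform."""
    return n * n * math.log2(n)

def image_cost(image_size, block_size):
    """Cost of transforming a square image in non-overlapping square blocks."""
    blocks = (image_size // block_size) ** 2
    return blocks * transform_cost(block_size)

full = image_cost(256, 256)      # one 256 x 256 transform
blocked = image_cost(256, 16)    # 256 blocks of 16 x 16

print(full / blocked)  # → 2.0
```

The number of picture elements is the same in both cases, so the ratio
reduces to log2(256) / log2(16) = 8 / 4.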

The new image coding techniques utilizing large-size
transforms significantly outperform the block-encoding transform
technique. The usual block size (16 x 16) exceeds the number of picture
elements over which the image is significantly correlated; it was
therefore previously postulated that larger-size transform blocks
may result in negligible performance improvement. However, block
encoding, particularly at low data rates, assigns a significant fraction
of the available bits for reconstruction of block-to-block boundaries.
Stated in another way, the image statistics are significantly altered
by grouping into adjacent image blocks. Discovery and analysis of
this fact provides a sound theoretical basis for the experimental
success of the coding algorithms in this dissertation.

The experimental portion of the dissertation includes coding
algorithms for monochrome, color, and interframe images. It has
been found that the data rate can be reduced to 0.38 bit for
monochrome, 0.55 bit for color, and 0.25 bit for interframe images. The
implementation included both the Fourier and Walsh transforms. Visual
image degradation, however, was more significant for the Walsh than
for the Fourier transform.

The coding scheme is susceptible to channel errors. It was
shown that the coder output is statistically equivalent to a discrete
memoryless source; thus, conventional channel encoding techniques
are applicable. The coding procedure is capable of a wide range of
data compression; thus the requirement for algebraic redundancy
(channel coding) can be offset by additional image data compression
(source coding).

The non-negative image constraint has been studied via the
Lukosz bound.


The conclusions of this dissertation are:

a) Determination of the proper transform domain image
model is important.

b) Utilization of large-size transforms and adaptive phase
coding permits significant additional rate reduction when
comparison is made with block encoding.

c) The superiority of phase as a random variable for coding
has been demonstrated.

d) The development of improved predicting algorithms and
preprocessing filters may result in additional bandwidth
reduction. The polynomial surface fit algorithm, in
addition, could be utilized for the image model.

e) Adaptivity is important to deal with non-stationary image
structure, particularly for the interframe case, and
residual noise. The latter consideration was shown to
be important for most practical situations.


APPENDIX  A 


ORIGINAL  TEST  IMAGES 


Various test images used in experimental sections of this
dissertation are shown in this appendix. Figure A-1 shows a redisplay
of the two original monochrome images. Color test images are
shown in Figure A-2. Their three primary components (tristimulus
values in the NTSC receiver phosphor primary system) are shown in
Figure A-3 in monochrome presentations. Two image sequences
used for interframe coding are shown in Figures A-4 and A-5.
Image differences for these sequences are presented in Figure A-6.

The monochrome (Figure A-1) and color test images (Figure
A-2) were obtained by digitization of photographic transparencies.
The image sequences for the interframe case were obtained from
a digitized video signal. All sampled images consist of 256 x 256
picture elements, and each original sample is uniformly quantized to
256 levels (8 bits). The monochrome images were displayed on a
flying spot scanner and photographed on Polaroid Type 52 film. The
color images were displayed on the Aerojet Model SG-D2219 video
display and photographed on high-speed Ektachrome film.

Figure A-1. Monochrome Test Images. Panels: (a) Girl, (b) Couple.


Figure A-2. Color Test Images. Panels: (a) Girl, (b) Couple.


Primary Component Images, (a) Red, (b) Green, (c) Blue

Figure A-5. Image Sequence: BELL DUMMY. Panels: (a) First Image,
(b) Second Image.


Figure A-6. Absolute Image Differences Among Consecutive
Images in Image Sequences of Figures A-4 and A-5. Panels:
(a) Difference Between First and Second Image, (b) Difference
Between Second and Third Image.


APPENDIX  B 


NUMERICAL NOISE GENERATED BY LARGE-SIZE FOURIER TRANSFORMS

Results of a simple computer experiment are shown in this
appendix to demonstrate that large (256 x 256) numerical Fourier
transforms are expected to generate a negligible amount of numerical
noise. The "girl" image was Fourier-transformed, the result was
inverse-transformed, and the appropriate mean squared error
was calculated. This cycle was repeated two more times on the
retransformed images. The results are shown in Table B-1. All
calculations were performed on an IBM 360/44 computer with single
precision (32 bit) arithmetic.

TABLE B-1

DEMONSTRATION OF FOURIER TRANSFORM-GENERATED NUMERICAL NOISE

Cycle    MSE

1        9 x 10^-8 %

2        49 x 10^-8 %

3        100 x 10^-8 %

Table B-1 demonstrates that numerical noise generated by
large-size transforms will probably have negligible influence on
image coding problems.
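A comparable experiment can be sketched with NumPy. This is not a
reproduction of Table B-1: a random synthetic image stands in for the
"girl" image, IEEE 32-bit arithmetic differs from the IBM 360/44
floating-point format, and the error is assumed here to be normalized
by the image energy and expressed in percent. The function name is
hypothetical.

```python
import numpy as np

def roundtrip_mse_percent(image, cycles=3):
    """Repeat forward/inverse 2-D FFT cycles and report the error after each.

    The error is measured against the original image, normalized by the
    image energy, in percent.
    """
    original = image.astype(np.float64)
    energy = np.sum(original ** 2)
    current = image.astype(np.float32)
    results = []
    for _ in range(cycles):
        # Store every intermediate result with 32-bit precision only.
        spectrum = np.fft.fft2(current).astype(np.complex64)
        current = np.fft.ifft2(spectrum).real.astype(np.float32)
        error = np.sum((current.astype(np.float64) - original) ** 2)
        results.append(100.0 * error / energy)
    return results

rng = np.random.default_rng(0)
test_image = rng.integers(0, 256, size=(256, 256)).astype(np.float32)

mse = roundtrip_mse_percent(test_image)
for cycle, value in enumerate(mse, start=1):
    print(cycle, value)
```

The absolute values depend on the arithmetic and the FFT implementation,
so they should not be expected to match the table; the sketch only
shows how such a round-trip measurement can be set up.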
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